nips nips2011 nips2011-209 knowledge-graph by maker-knowledge-mining

209 nips-2011-Orthogonal Matching Pursuit with Replacement


Source: pdf

Author: Prateek Jain, Ambuj Tewari, Inderjit S. Dhillon

Abstract: In this paper, we consider the problem of compressed sensing where the goal is to recover all sparse vectors using a small number of fixed linear measurements. For this problem, we propose a novel partial hard-thresholding operator that leads to a general family of iterative algorithms. While one extreme of the family yields well known hard thresholding algorithms like ITI and HTP [17, 10], the other end of the spectrum leads to a novel algorithm that we call Orthogonal Matching Pursuit with Replacement (OMPR). OMPR, like the classic greedy algorithm OMP, adds exactly one coordinate to the support at each iteration, based on the correlation with the current residual. However, unlike OMP, OMPR also removes one coordinate from the support. This simple change allows us to prove that OMPR has the best known guarantees for sparse recovery in terms of the Restricted Isometry Property (a condition on the measurement matrix). In contrast, OMP is known to have very weak performance guarantees under RIP. Given its simple structure, we are able to extend OMPR using locality sensitive hashing to get OMPR-Hash, the first provably sub-linear (in dimensionality) algorithm for sparse recovery. Our proof techniques are novel and flexible enough to also permit the tightest known analysis of popular iterative algorithms such as CoSaMP and Subspace Pursuit. We provide experimental results on large problems, providing recovery for vectors of size up to a million dimensions. We demonstrate that for large-scale problems our proposed methods are more robust and faster than existing methods.

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 In this paper, we consider the problem of compressed sensing where the goal is to recover all sparse vectors using a small number of fixed linear measurements. [sent-7, score-0.168]

2 For this problem, we propose a novel partial hard-thresholding operator that leads to a general family of iterative algorithms. [sent-8, score-0.168]

3 While one extreme of the family yields well known hard thresholding algorithms like ITI and HTP [17, 10], the other end of the spectrum leads to a novel algorithm that we call Orthogonal Matching Pursuit with Replacement (OMPR). [sent-9, score-0.267]

4 OMPR, like the classic greedy algorithm OMP, adds exactly one coordinate to the support at each iteration, based on the correlation with the current residual. [sent-10, score-0.114]

5 This simple change allows us to prove that OMPR has the best known guarantees for sparse recovery in terms of the Restricted Isometry Property (a condition on the measurement matrix). [sent-12, score-0.284]

6 Given its simple structure, we are able to extend OMPR using locality sensitive hashing to get OMPR-Hash, the first provably sub-linear (in dimensionality) algorithm for sparse recovery. [sent-14, score-0.199]

7 Our proof techniques are novel and flexible enough to also permit the tightest known analysis of popular iterative algorithms such as CoSaMP and Subspace Pursuit. [sent-15, score-0.088]

8 We provide experimental results on large problems, providing recovery for vectors of size up to a million dimensions. [sent-16, score-0.155]

9 Compressed sensing, a new and active branch of modern signal processing, deals with the problem of designing measurement matrices and recovery algorithms, such that almost all sparse signals can be recovered from a small number of measurements. [sent-22, score-0.229]

10 In this paper, we focus on the compressed sensing setting [3, 7] where we want to design a measurement matrix A ∈ R^{m×n} such that a sparse vector x* ∈ R^n with ‖x*‖_0 := |supp(x*)| ≤ k < n can be efficiently recovered from the measurements b = Ax* ∈ R^m. [sent-24, score-0.277]

11 Initial work focused on various random ensembles of matrices A such that, if A was chosen randomly from that ensemble, one would be able to recover all or almost all sparse vectors x* from Ax*. [sent-25, score-0.115]

12 Candes and Tao [3] isolated a key property called the Restricted Isometry Property (RIP) and proved that, as long as the measurement matrix A satisfies RIP, the true sparse vector can be obtained by solving an ℓ1-optimization problem, min ‖x‖_1 s.t. [sent-26, score-0.205]
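The ℓ1-optimization (basis pursuit) program referenced in the truncated sentence above is the standard one:

\min_{x \in \mathbb{R}^n} \|x\|_1 \quad \text{subject to} \quad Ax = b .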

13 It was shown in [2] that ℓ1-minimization recovers all k-sparse vectors provided A satisfies an appropriate RIP condition. [sent-31, score-0.204]

14 Note that, in compressed sensing, the goal is to recover all, or most, k-sparse signals using the same measurement matrix A. [sent-35, score-0.123]

15 Hence, weaker conditions such as restricted convexity [20] studied in the statistical literature (where the aim is to recover a single sparse vector from noisy linear measurements) typically do not suffice. [sent-36, score-0.134]

16 In fact, if RIP is not satisfied then multiple sparse vectors x can lead to the same observation b, hence making recovery of the true sparse vector impossible. [sent-37, score-0.308]

17 Based on its RIP guarantees, ℓ1-minimization can guarantee recovery using just O(k log(n/k)) measurements, but it has been observed in practice that ℓ1-minimization is too expensive in large scale applications [8], for example, when the dimensionality is in the millions. [sent-38, score-0.13]

18 This has sparked a huge interest in other iterative methods for sparse recovery. [sent-39, score-0.099]

19 Indeed, it is known that, if run for k iterations, OMP cannot uniformly recover all k-sparse vectors assuming an RIP condition of the form δ_2k ≤ c [22, 18]. [sent-42, score-0.109]

20 However, Zhang [26] showed that OMP, if run for 30k iterations, recovers the optimal solution when δ_31k ≤ 1/3; a significantly more restrictive condition than the ones required by other methods like ℓ1-minimization. [sent-43, score-0.187]

21 In the family of iterative hard thresholding algorithms, we can identify two major subfamilies [17]: one- and two-stage algorithms. [sent-47, score-0.267]

22 One-stage algorithms such as IHT, ITI and HTP, decide on the choice of the next support set and then usually solve a least squares problem on the updated support. [sent-49, score-0.14]

23 On the other hand, two-stage algorithms, notable examples being CoSaMP and SP, first enlarge the support set, solve a least squares problem on it, and then reduce the support set back again to the desired size. [sent-51, score-0.227]

24 However, it differs from our proposed methods as its analysis requires very restrictive RIP conditions (δ_8k < 0. [sent-55, score-0.1]

25 1 as quoted in [14]) and the connection to locality sensitive hashing (see below) is not made. [sent-56, score-0.131]

26 In this paper, we present and provide a unified analysis for a family of one-stage iterative hard thresholding algorithms. [sent-60, score-0.352]

27 OMPR can be thought of as a simple modification of the classic greedy algorithm OMP: instead of simply adding an element to the existing support, it replaces an existing support element with a new one. [sent-64, score-0.167]
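That one-swap-per-iteration description can be illustrated with a short sketch. This is only a hedged reconstruction from the text, not the authors' code; the function name, the gradient step, and the exact swap rule below are our assumptions.

```python
import numpy as np

def ompr(A, b, k, eta=1.0, n_iters=100):
    """Illustrative sketch of OMPR: at each iteration one coordinate is
    added to the support (chosen by correlation with the residual), one
    in-support coordinate is dropped, and least squares is re-solved on
    the resulting support."""
    m, n = A.shape
    # Initialize with the k columns most correlated with b (an assumption).
    support = set(int(j) for j in np.argsort(-np.abs(A.T @ b))[:k])
    x = np.zeros(n)
    for _ in range(n_iters):
        idx = sorted(support)
        # Least squares restricted to the current support.
        x = np.zeros(n)
        x[idx] = np.linalg.lstsq(A[:, idx], b, rcond=None)[0]
        residual = b - A @ x
        if np.linalg.norm(residual) < 1e-10:
            break
        # Gradient step: z = x + eta * A^T (b - A x).
        z = x + eta * (A.T @ residual)
        # Add the out-of-support coordinate with largest |z| ...
        outside = [j for j in range(n) if j not in support]
        j_add = outside[int(np.argmax(np.abs(z[outside])))]
        # ... and drop the in-support coordinate with smallest |z|.
        j_drop = idx[int(np.argmin(np.abs(z[idx])))]
        if abs(z[j_add]) > abs(z[j_drop]):
            support.remove(j_drop)
            support.add(j_add)
    return x
```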

28 Surprisingly, this change allows us to prove sparse recovery under the condition δ_2k < 0. [sent-65, score-0.193]

29 This is the best δ_2k-based RIP condition under which any method, including ℓ1-minimization, is (currently) known to provably perform sparse recovery. [sent-67, score-0.091]

30 OMPR also lends itself to a faster implementation using locality sensitive hashing (LSH). [sent-68, score-0.131]

31 This allows us to provide recovery guarantees using an algorithm whose run-time is provably sub-linear in n, the number of dimensions. [sent-69, score-0.19]

32 As a result, we are able to prove better recovery guarantees for these algorithms: δ_4k < 0. [sent-73, score-0.162]

33 We hope that this unified analysis sheds more light on the interrelationships between the various kinds of iterative hard thresholding algorithms. [sent-76, score-0.303]

34 • We present a family of iterative hard thresholding algorithms that on one end of the spectrum includes existing methods such as ITI/HTP while on the other end gives OMPR. [sent-78, score-0.326]

35 • Unlike other improvements over OMP, such as CoSaMP or SP, OMPR changes only one element of the support at a time. [sent-80, score-0.113]

36 This allows us to use Locality Sensitive Hashing (LSH) to speed it up, resulting in the first provably sub-linear (in the ambient dimensionality n) time sparse recovery algorithm. [sent-81, score-0.198]

37 • We provide a general proof for all the algorithms in our partial hard thresholding based family. [sent-86, score-0.27]

38 In particular, we can guarantee recovery using OMPR, under both noiseless and noisy settings, provided δ_2k < 0. [sent-87, score-0.18]

39 condition under which any efficient sparse recovery method is known to work. [sent-90, score-0.228]

40 Furthermore, our proof technique can be used to provide a general theorem that provides the least restrictive known guarantees for all the two-stage algorithms such as CoSaMP and SP (see Appendix D). [sent-91, score-0.175]

41 2 Orthogonal Matching Pursuit with Replacement. Orthogonal matching pursuit (OMP) is a classic iterative algorithm for sparse recovery. [sent-93, score-0.266]

42 At every stage, it selects a coordinate to include in the current support set by maximizing the inner product between columns of the measurement matrix A and the current residual b − Ax^t. [sent-94, score-0.174]

43 Once the new coordinate has been added, it solves a least squares problem to fully minimize the error on the current support set. As a result, the residual becomes orthogonal to the columns of A that correspond to the current support set. [sent-95, score-0.282]
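A minimal sketch of that OMP iteration (pick the column most correlated with the residual, then solve least squares on the grown support so the residual becomes orthogonal to the selected columns); identifiers are ours, not the authors':

```python
import numpy as np

def omp(A, b, k):
    """Sketch of classic Orthogonal Matching Pursuit."""
    n = A.shape[1]
    support = []
    x = np.zeros(n)
    residual = b.copy()
    for _ in range(k):
        # Column most correlated with the current residual.
        j = int(np.argmax(np.abs(A.T @ residual)))
        if j not in support:
            support.append(j)
        # Least squares on the current support.
        x = np.zeros(n)
        x[support] = np.linalg.lstsq(A[:, support], b, rcond=None)[0]
        residual = b - A @ x
    return x
```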

44 As notation: A\b := argmin_x ‖Ax − b‖². The hard thresholding operator H_k. [sent-100, score-0.255]
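Under the reading that A\b is the unconstrained least-squares solution and H_k keeps the k largest-magnitude entries of its argument, both primitives are a few lines each (a sketch, not the authors' definitions):

```python
import numpy as np

def ls_solve(A, b):
    """A \\ b : least-squares solution argmin_x ||Ax - b||^2."""
    return np.linalg.lstsq(A, b, rcond=None)[0]

def hard_threshold(z, k):
    """H_k(z): keep the k largest-magnitude entries of z, zero the rest."""
    out = np.zeros_like(z)
    keep = np.argsort(-np.abs(z))[:k]
    out[keep] = z[keep]
    return out
```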

45 Once the support I^{t+1} of the next iterate has been determined, the actual iterate x^{t+1} is obtained by solving the least squares problem: x^{t+1} = argmin_{x: supp(x) = I^{t+1}} ‖Ax − b‖². [sent-112, score-0.252]

46 Note that if the matrix A satisfies RIP of order k or larger, the above problem will be well conditioned and can be solved quickly and reliably using an iterative least squares solver. [sent-113, score-0.247]

47 We will show that OMPR, unlike OMP, recovers any k-sparse vector under the RIP-based condition δ_2k < 0. [sent-114, score-0.131]

48 This appears to be the least restrictive recovery condition (i. [sent-117, score-0.238]

49 e., best known condition) under which any method, be it basis pursuit (ℓ1-minimization) or some iterative algorithm, is guaranteed to recover all k-sparse vectors. [sent-119, score-0.195]

50 4992, which makes it heuristically the least restrictive RIP condition for sparse recovery. [sent-126, score-0.148]

51 Suppose the vector x* ∈ R^n is k-sparse and the matrix A satisfies δ_2k < 0. [sent-129, score-0.106]

52 Suppose the vector x* ∈ R^n is k-sparse and the matrix A satisfies δ_2k < 0. [sent-136, score-0.106]

53 The l-th member of this family, OMPR(l), shown in Algorithm 2, replaces at most l elements of the current support with new elements. [sent-150, score-0.125]

54 Our first result in this section connects the OMPR family to hard thresholding. [sent-153, score-0.122]

55 Given a set I of cardinality k, define the partial hard thresholding operator H_k(z; I, l) := argmin_{y: |supp(y)| ≤ k, |supp(y)\I| ≤ l} ‖y − z‖. [sent-154, score-0.307]

56 The name partial hard thresholding operator is justified because of the following reasoning. [sent-156, score-0.278]

57 In fact, it becomes identical to the standard hard thresholding operator H_k. [sent-159, score-0.255]

58 The following result shows that even the partial hard thresholding operator is easy to compute. [sent-164, score-0.278]
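One way to compute H_k(z; I, l), assuming the definition above (support of size at most k, with at most l coordinates outside I), is to pool I with the l largest-magnitude coordinates outside I and keep the top k of that pool. A hedged sketch:

```python
import numpy as np

def partial_hard_threshold(z, I, k, l):
    """Sketch of the partial hard thresholding operator H_k(z; I, l):
    closest vector to z whose support has size at most k and uses at
    most l coordinates outside the given set I (one reading of the
    definition in the text)."""
    inside = np.array(sorted(int(i) for i in I), dtype=int)
    outside = np.array([j for j in range(len(z)) if j not in set(inside.tolist())], dtype=int)
    # The l best candidates outside I, by magnitude of z.
    top_outside = outside[np.argsort(-np.abs(z[outside]))[:l]]
    pool = np.concatenate([inside, top_outside])
    # Keep the k largest-magnitude entries within the pool.
    keep = pool[np.argsort(-np.abs(z[pool]))[:k]]
    y = np.zeros_like(z)
    y[keep] = z[keep]
    return y
```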

59 In each iteration (with current iterate x^t having support I_t ⊇ supp(x^t)), we do the following: 1. [sent-173, score-0.114]

60 (Partial Hard Thresholding) Form y^{t+1} by partially hard thresholding z^{t+1} using the operator H_k. [sent-177, score-0.255]

61 (Least Squares) Form the next iterate x^{t+1} by solving a least squares problem on the support I^{t+1} of y^{t+1}. [sent-181, score-0.196]
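Putting these steps together, one iteration of the OMPR(l) family might look like the sketch below. The gradient-step form of z^{t+1} is our assumption (its description is truncated above), and `partial_hard_threshold` refers to the operator sketched earlier; this is not the authors' code.

```python
import numpy as np

def ompr_l_iteration(A, b, x, k, l, eta=1.0):
    """One iteration of the OMPR(l) family, as read from the text:
    gradient step, partial hard thresholding, least squares."""
    support = np.flatnonzero(x)
    # Step 1 (assumed form): z^{t+1} = x^t + eta * A^T (b - A x^t).
    z = x + eta * (A.T @ (b - A @ x))
    # Step 2: keep support size k while changing at most l elements
    # of the current support (see the earlier partial_hard_threshold sketch).
    y = partial_hard_threshold(z, support, k, l)
    new_support = np.flatnonzero(y)
    # Step 3: least squares restricted to the support of y^{t+1}.
    x_next = np.zeros_like(x)
    x_next[new_support] = np.linalg.lstsq(A[:, new_support], b, rcond=None)[0]
    return x_next
```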

62 A nice property enjoyed by the entire OMPR family is guaranteed sparse recovery under RIP based conditions. [sent-182, score-0.219]

63 Note from below that the condition under which OMPR(l) recovers sparse vectors becomes more restrictive as l increases. [sent-183, score-0.217]

64 This could be an artifact of our analysis, as in experiments we do not see any degradation in recovery ability as l is increased. [sent-184, score-0.13]

65 1/2‖Ax − b‖² ≤ ε) from measurements b = Ax* in O((k/l) log(k/ε)) iterations provided we choose a step size η that satisfies η(1 + δ_2k) < 1 [sent-189, score-0.172]

66 1/2‖Ax − b‖² ≤ ‖e‖² + ε) from measurements b = Ax* + e in O(log((k + ‖e‖²)/ε)) iterations provided we choose a step size η that satisfies η(1 + δ_2k) < 1 and η(1 − δ_2k) [sent-196, score-0.172]

67 Due to the least squares step of the previous iteration, the current residual Ax^t − b is orthogonal to the columns of A_{I_t}. [sent-202, score-0.126]
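In symbols, the optimality condition of that least squares step reads

A_{I_t}^{\top}\,(A x^{t} - b) = 0 .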

68 Then our method monotonically decreases the objective function and converges to a local optimum even if RIP is not satisfied (note that the upper RIP bound is independent of the lower RIP bound, and can always be satisfied by normalizing the matrix appropriately). [sent-211, score-0.088]

69 The above general result for the OMPR family immediately implies that it recovers sparse vectors as soon as the measurement matrix A satisfies δ_2k < 1/3. [sent-232, score-0.293]

70 Suppose the vector x* ∈ R^n is k-sparse and the matrix A satisfies δ_2k < 1/3. [sent-234, score-0.106]

71 Then IHT-Newton recovers x* from measurements b = Ax* in O(log(k)) iterations. [sent-235, score-0.139]

72 4 Tighter Analysis of Two-Stage Hard Thresholding Algorithms. Recently, Maleki and Donoho [17] proposed a novel family of algorithms, namely two-stage hard thresholding algorithms. [sent-236, score-0.139]

73 During each iteration, these algorithms add a fixed number (say l) of elements to the current iterate's support set. [sent-237, score-0.103]

74 A least squares problem is solved over the larger support set and then l elements with smallest magnitude are dropped to form the next iterate's support set. [sent-238, score-0.198]

75 The next iterate is then obtained by again solving the least squares problem over the next iterate's support set. [sent-239, score-0.196]
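A sketch of one such two-stage iteration (enlarge the support by l coordinates, least squares, drop back to size k, least squares again). Parameter names and the correlation-based enlargement rule are our assumptions about details not spelled out here:

```python
import numpy as np

def two_stage_ht_iteration(A, b, x, k, l):
    """One CoSaMP/SP-style two-stage hard thresholding iteration."""
    n = A.shape[1]
    support = set(np.flatnonzero(x).tolist())
    # Stage 1: enlarge the support by the l largest-magnitude correlations
    # outside the current support, then solve least squares on it.
    corr = np.abs(A.T @ (b - A @ x))
    outside = [int(j) for j in np.argsort(-corr) if int(j) not in support][:l]
    enlarged = sorted(support | set(outside))
    w = np.zeros(n)
    w[enlarged] = np.linalg.lstsq(A[:, enlarged], b, rcond=None)[0]
    # Stage 2: drop back to the k largest-magnitude entries and re-solve.
    keep = np.array(enlarged)[np.argsort(-np.abs(w[enlarged]))[:k]]
    x_next = np.zeros(n)
    x_next[keep] = np.linalg.lstsq(A[:, keep], b, rcond=None)[0]
    return x_next
```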

76 Using proof techniques developed for our proof of Theorem 4, we can obtain a simple proof for the entire spectrum of algorithms in the two-stage hard thresholding family. [sent-241, score-0.388]

77 Then the Two-stage Hard Thresholding algorithm with replacement size l recovers x* from measurements b = Ax* in O(k) iterations provided: δ [sent-244, score-0.19]

78 CoSaMP [19] recovers k-sparse x* ∈ {−1, 0, 1}^n from measurements b = Ax* provided δ_4k ≤ 0. [sent-250, score-0.139]

79 Subspace Pursuit [4] recovers k-sparse x* ∈ {−1, 0, 1}^n from measurements b = Ax* provided δ_3k ≤ 0. [sent-253, score-0.139]

80 Note that while LSH is designed for nearest neighbor search (in terms of Euclidean distances) and in general might not have any guarantees for the similar neighbor search task, we are still able to apply it to our task because we can lower-bound the similarity of the most similar neighbor. [sent-264, score-0.115]

81 LSH generates hash bits for a vector using randomized hash functions that have the property that the probability of collision between two vectors is proportional to the similarity between them. [sent-266, score-0.267]

82 1 − (1/π) cos⁻¹( a_1ᵀa_2 / (‖a_1‖ ‖a_2‖) ), created by randomly sampling hash functions h_r. [sent-270, score-0.106]
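The collision probability fragment above matches the signed-random-projection family, for which Pr[h_r(a_1) = h_r(a_2)] = 1 − (1/π)·cos⁻¹(a_1ᵀa_2 / (‖a_1‖‖a_2‖)). A hash family of this kind can be sketched as follows (a generic sketch, not the paper's specific construction):

```python
import numpy as np

def make_srp_hash(dim, n_bits, seed=0):
    """Signed random projections: each bit is the sign of <r_i, a> for a
    random Gaussian direction r_i, so two vectors collide on a bit with
    probability 1 - (1/pi) * arccos(cosine_similarity(a1, a2))."""
    rng = np.random.default_rng(seed)
    R = rng.standard_normal((n_bits, dim))
    def hash_key(a):
        return tuple(int(v > 0) for v in R @ a)
    return hash_key

# Usage sketch: build q independent tables keyed by such hash functions
# over the columns of A, then probe each table with the query vector.
```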

83 Next, q hash tables are constructed during the pre-processing stage using independently constructed hash-key functions g_1, g_2, … [sent-274, score-0.286]

84 …, g_q. During the query stage, a query is indexed into each hash table using hash-key functions g_1, g_2, … [sent-277, score-0.165]

85 However, we cannot directly use the above theorem to guarantee convergence of our hashing-based OMPR algorithm, as our algorithm requires finding the most similar point in terms of magnitude of the inner product. [sent-292, score-0.18]

86 A detailed proof of the theorem below can be found in Appendix B. [sent-294, score-0.141]

87 The above theorem shows that the time complexity is sub-linear in n. [sent-297, score-0.112]

88 6 Experimental Results. In this section we present empirical results to demonstrate accurate and fast recovery by our OMPR method. [sent-303, score-0.13]

89 In the first set of experiments, we present a phase transition diagram for OMPR and compare it to the phase transition diagrams of OMP and IHT-Newton with step size 1. [sent-304, score-0.251]

90 For the second set of experiments, we demonstrate robustness of OMPR compared to many existing methods when measurements are noisy or smaller in number than what is required for exact recovery. [sent-305, score-0.119]

91 For the third set of experiments, we demonstrate efficiency of our LSH-based implementation by comparing recovery error and time required for our method with OMP and IHT-Newton (with step size 1 and 1/2). [sent-306, score-0.13]

92 We do not present results for the ℓ1/basis pursuit methods, as it has already been shown in several recent papers [10, 17] that the ℓ1 relaxation based methods are relatively inefficient for very large scale recovery problems. [sent-307, score-0.24]

93 The underlying k-sparse vectors are generated by randomly selecting a support set of size k and then each entry in the support set is sampled uniformly from {+1, −1}. [sent-309, score-0.141]
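That test-signal generation takes a couple of lines; the Gaussian measurement matrix below is our assumption about the (unstated here) measurement ensemble:

```python
import numpy as np

def make_instance(m, n, k, seed=0):
    """Random compressed sensing instance: k-sparse +/-1 signal and an
    (assumed) Gaussian measurement matrix with normalized columns."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((m, n)) / np.sqrt(m)
    x = np.zeros(n)
    support = rng.choice(n, size=k, replace=False)
    x[support] = rng.choice([-1.0, 1.0], size=k)
    return A, x, A @ x
```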

94 1 Phase Transition Diagrams We first compare different methods using phase transition diagrams which are commouly used in compressed sensing literatore to compare different methods [17]. [sent-313, score-0.213]

95 In Figure 1, we show the phase transition diagram of our OMPR method as well as that of OMP and IHT-Newton (with step size 1). [sent-318, score-0.115]

96 The plots show probability of successful recovery as a function of ρ = m/n and δ = k/m. [sent-319, score-0.13]

97 On the other hand, as expected, the phase transition diagram of OMP has a negligible fraction of the plot that shows high recovery probability. [sent-323, score-0.245]

98 As shown in the phase transition diagrams in Figure 1, OMPR provides comparable recovery to the IHT-Newton method for noiseless cases. [sent-325, score-0.292]

99 We then generate a random binary vector x of sparsity k and add Gaussian noise to it. Figure 2(a) shows recovery error (‖Ax − b‖) incurred by various methods for increasing k and noise level of 10%. [sent-328, score-0.13]

100 Similarly, Figure 2 (b) shows recovery error incurred by various methods for fixed k = 50 and varying noise level. [sent-330, score-0.175]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('ompr', 0.773), ('rip', 0.231), ('omp', 0.189), ('ax', 0.147), ('thresholding', 0.145), ('cosamp', 0.13), ('recovery', 0.13), ('lsh', 0.111), ('pursuit', 0.11), ('satisfies', 0.106), ('supp', 0.093), ('usiug', 0.088), ('iu', 0.083), ('hash', 0.077), ('hard', 0.073), ('recovers', 0.073), ('hashing', 0.068), ('measurements', 0.066), ('iterative', 0.059), ('measurement', 0.059), ('support', 0.058), ('cooditioo', 0.058), ('ihi', 0.058), ('zhi', 0.058), ('restrictive', 0.056), ('iterate', 0.056), ('squares', 0.053), ('diagrams', 0.052), ('sp', 0.052), ('htp', 0.051), ('irn', 0.051), ('replacement', 0.051), ('family', 0.049), ('phase', 0.047), ('fixed', 0.045), ('appeodix', 0.044), ('bll', 0.044), ('cooditioos', 0.044), ('corouary', 0.044), ('curreot', 0.044), ('doring', 0.044), ('iiax', 0.044), ('iiy', 0.044), ('satisfied', 0.044), ('sparse', 0.04), ('locality', 0.039), ('sensing', 0.039), ('compressed', 0.038), ('operator', 0.037), ('transition', 0.037), ('efficiently', 0.035), ('hk', 0.033), ('guarantees', 0.032), ('austin', 0.031), ('diagram', 0.031), ('hi', 0.031), ('xl', 0.029), ('proof', 0.029), ('theorem', 0.029), ('briefly', 0.029), ('choseo', 0.029), ('doce', 0.029), ('enlarge', 0.029), ('eotries', 0.029), ('existiog', 0.029), ('existiug', 0.029), ('extrente', 0.029), ('giveo', 0.029), ('heoce', 0.029), ('ily', 0.029), ('ilz', 0.029), ('isupp', 0.029), ('observatioo', 0.029), ('ooe', 0.029), ('orthogooal', 0.029), ('randoruly', 0.029), ('residnal', 0.029), ('tophi', 0.029), ('zll', 0.029), ('least', 0.029), ('matching', 0.029), ('classic', 0.028), ('provably', 0.028), ('coordinate', 0.028), ('orthogonal', 0.027), ('noiseless', 0.026), ('recover', 0.026), ('axt', 0.026), ('unified', 0.026), ('ouly', 0.026), ('india', 0.026), ('theo', 0.026), ('vectors', 0.025), ('sensitive', 0.024), ('noisy', 0.024), ('subspace', 0.024), ('ensembles', 0.024), ('condition', 0.023), ('replaces', 0.023), ('partial', 0.023)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000002 209 nips-2011-Orthogonal Matching Pursuit with Replacement

Author: Prateek Jain, Ambuj Tewari, Inderjit S. Dhillon

Abstract: In this paper, we consider the problem of compressed sensing where the goal is to recover all sparse vectors using a small number of fixed linear measurements. For this problem, we propose a novel partial hard-thresholding operator that leads to a general family of iterative algorithms. While one extreme of the family yields well known hard thresholding algorithms like ITI and HTP [17, 10], the other end of the spectrum leads to a novel algorithm that we call Orthogonal Matching Pursuit with Replacement (OMPR). OMPR, like the classic greedy algorithm OMP, adds exactly one coordinate to the support at each iteration, based on the correlation with the current residual. However, unlike OMP, OMPR also removes one coordinate from the support. This simple change allows us to prove that OMPR has the best known guarantees for sparse recovery in terms of the Restricted Isometry Property (a condition on the measurement matrix). In contrast, OMP is known to have very weak performance guarantees under RIP. Given its simple structure, we are able to extend OMPR using locality sensitive hashing to get OMPR-Hash, the first provably sub-linear (in dimensionality) algorithm for sparse recovery. Our proof techniques are novel and flexible enough to also permit the tightest known analysis of popular iterative algorithms such as CoSaMP and Subspace Pursuit. We provide experimental results on large problems, providing recovery for vectors of size up to a million dimensions. We demonstrate that for large-scale problems our proposed methods are more robust and faster than existing methods.

2 0.11011837 297 nips-2011-Universal low-rank matrix recovery from Pauli measurements

Author: Yi-kai Liu

Abstract: We study the problem of reconstructing an unknown matrix M of rank r and dimension d using O(rd poly log d) Pauli measurements. This has applications in quantum state tomography, and is a non-commutative analogue of a well-known problem in compressed sensing: recovering a sparse vector from a few of its Fourier coefficients. We show that almost all sets of O(rd log^6 d) Pauli measurements satisfy the rank-r restricted isometry property (RIP). This implies that M can be recovered from a fixed (“universal”) set of Pauli measurements, using nuclear-norm minimization (e.g., the matrix Lasso), with nearly-optimal bounds on the error. A similar result holds for any class of measurements that use an orthonormal operator basis whose elements have small operator norm. Our proof uses Dudley’s inequality for Gaussian processes, together with bounds on covering numbers obtained via entropy duality.

3 0.098850533 257 nips-2011-SpaRCS: Recovering low-rank and sparse matrices from compressive measurements

Author: Andrew E. Waters, Aswin C. Sankaranarayanan, Richard Baraniuk

Abstract: We consider the problem of recovering a matrix M that is the sum of a low-rank matrix L and a sparse matrix S from a small set of linear measurements of the form y = A(M) = A(L + S). This model subsumes three important classes of signal recovery problems: compressive sensing, affine rank minimization, and robust principal component analysis. We propose a natural optimization problem for signal recovery under this model and develop a new greedy algorithm called SpaRCS to solve it. Empirically, SpaRCS inherits a number of desirable properties from the state-of-the-art CoSaMP and ADMiRA algorithms, including exponential convergence and efficient implementation. Simulation results with video compressive sensing, hyperspectral imaging, and robust matrix completion data sets demonstrate both the accuracy and efficacy of the algorithm. 1

4 0.09353637 264 nips-2011-Sparse Recovery with Brownian Sensing

Author: Alexandra Carpentier, Odalric-ambrym Maillard, Rémi Munos

Abstract: We consider the problem of recovering the parameter α ∈ R^K of a sparse function f (i.e. the number of non-zero entries of α is small compared to the number K of features) given noisy evaluations of f at a set of well-chosen sampling points. We introduce an additional randomization process, called Brownian sensing, based on the computation of stochastic integrals, which produces a Gaussian sensing matrix, for which good recovery properties are proven, independently of the number of sampling points N, even when the features are arbitrarily non-orthogonal. Under the assumption that f is Hölder continuous with exponent at least 1/2, we provide an estimate α̂ of the parameter such that ‖α̂ − α‖₂ = O(‖η‖₂/√N), where η is the observation noise. The method uses a set of sampling points uniformly distributed along a one-dimensional curve selected according to the features. We report numerical experiments illustrating our method.

5 0.092041582 157 nips-2011-Learning to Search Efficiently in High Dimensions

Author: Zhen Li, Huazhong Ning, Liangliang Cao, Tong Zhang, Yihong Gong, Thomas S. Huang

Abstract: High dimensional similarity search in large scale databases becomes an important challenge due to the advent of Internet. For such applications, specialized data structures are required to achieve computational efficiency. Traditional approaches relied on algorithmic constructions that are often data independent (such as Locality Sensitive Hashing) or weakly dependent (such as kd-trees, k-means trees). While supervised learning algorithms have been applied to related problems, those proposed in the literature mainly focused on learning hash codes optimized for compact embedding of the data rather than search efficiency. Consequently such an embedding has to be used with linear scan or another search algorithm. Hence learning to hash does not directly address the search efficiency issue. This paper considers a new framework that applies supervised learning to directly optimize a data structure that supports efficient large scale search. Our approach takes both search quality and computational cost into consideration. Specifically, we learn a boosted search forest that is optimized using pair-wise similarity labeled examples. The output of this search forest can be efficiently converted into an inverted indexing data structure, which can leverage modern text search infrastructure to achieve both scalability and efficiency. Experimental results show that our approach significantly outperforms the state-of-the-art learning to hash methods (such as spectral hashing), as well as state-of-the-art high dimensional search algorithms (such as LSH and k-means trees).

6 0.089647926 203 nips-2011-On the accuracy of l1-filtering of signals with block-sparse structure

7 0.084884144 265 nips-2011-Sparse recovery by thresholded non-negative least squares

8 0.077346131 182 nips-2011-Nearest Neighbor based Greedy Coordinate Descent

9 0.074323311 189 nips-2011-Non-parametric Group Orthogonal Matching Pursuit for Sparse Learning with Multiple Kernels

10 0.069758847 113 nips-2011-Hierarchical Matching Pursuit for Image Classification: Architecture and Fast Algorithms

11 0.063852109 259 nips-2011-Sparse Estimation with Structured Dictionaries

12 0.060602881 239 nips-2011-Robust Lasso with missing and grossly corrupted observations

13 0.059495546 70 nips-2011-Dimensionality Reduction Using the Sparse Linear Model

14 0.055290826 172 nips-2011-Minimax Localization of Structural Information in Large Noisy Matrices

15 0.045917515 276 nips-2011-Structured sparse coding via lateral inhibition

16 0.045345932 91 nips-2011-Exploiting spatial overlap to efficiently compute appearance distances between image windows

17 0.043451846 118 nips-2011-High-dimensional regression with noisy and missing data: Provable guarantees with non-convexity

18 0.040982816 111 nips-2011-Hashing Algorithms for Large-Scale Learning

19 0.037418444 195 nips-2011-On Learning Discrete Graphical Models using Greedy Methods

20 0.037032533 231 nips-2011-Randomized Algorithms for Comparison-based Search


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.112), (1, 0.003), (2, -0.038), (3, -0.086), (4, -0.042), (5, 0.062), (6, 0.016), (7, 0.061), (8, 0.033), (9, -0.012), (10, 0.043), (11, 0.032), (12, -0.06), (13, 0.042), (14, 0.076), (15, -0.065), (16, 0.076), (17, -0.023), (18, -0.048), (19, 0.054), (20, 0.036), (21, 0.034), (22, 0.014), (23, -0.143), (24, -0.043), (25, -0.005), (26, -0.004), (27, 0.014), (28, 0.083), (29, 0.097), (30, -0.065), (31, 0.002), (32, 0.109), (33, -0.058), (34, 0.052), (35, -0.179), (36, -0.009), (37, -0.044), (38, 0.068), (39, -0.076), (40, 0.022), (41, -0.007), (42, 0.133), (43, 0.039), (44, -0.067), (45, 0.04), (46, -0.081), (47, 0.08), (48, 0.005), (49, 0.042)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.90502274 209 nips-2011-Orthogonal Matching Pursuit with Replacement

Author: Prateek Jain, Ambuj Tewari, Inderjit S. Dhillon

Abstract: In this paper, we consider the problem of compressed sensing where the goal is to recover all sparse vectors using a small number of fixed linear measurements. For this problem, we propose a novel partial hard-thresholding operator that leads to a general family of iterative algorithms. While one extreme of the family yields well known hard thresholding algorithms like ITI and HTP [17, 10], the other end of the spectrum leads to a novel algorithm that we call Orthogonal Matching Pursuit with Replacement (OMPR). OMPR, like the classic greedy algorithm OMP, adds exactly one coordinate to the support at each iteration, based on the correlation with the current residual. However, unlike OMP, OMPR also removes one coordinate from the support. This simple change allows us to prove that OMPR has the best known guarantees for sparse recovery in terms of the Restricted Isometry Property (a condition on the measurement matrix). In contrast, OMP is known to have very weak performance guarantees under RIP. Given its simple structure, we are able to extend OMPR using locality sensitive hashing to get OMPR-Hash, the first provably sub-linear (in dimensionality) algorithm for sparse recovery. Our proof techniques are novel and flexible enough to also permit the tightest known analysis of popular iterative algorithms such as CoSaMP and Subspace Pursuit. We provide experimental results on large problems, providing recovery for vectors of size up to a million dimensions. We demonstrate that for large-scale problems our proposed methods are more robust and faster than existing methods.

2 0.73592198 257 nips-2011-SpaRCS: Recovering low-rank and sparse matrices from compressive measurements

Author: Andrew E. Waters, Aswin C. Sankaranarayanan, Richard Baraniuk

Abstract: We consider the problem of recovering a matrix M that is the sum of a low-rank matrix L and a sparse matrix S from a small set of linear measurements of the form y = A(M) = A(L + S). This model subsumes three important classes of signal recovery problems: compressive sensing, affine rank minimization, and robust principal component analysis. We propose a natural optimization problem for signal recovery under this model and develop a new greedy algorithm called SpaRCS to solve it. Empirically, SpaRCS inherits a number of desirable properties from the state-of-the-art CoSaMP and ADMiRA algorithms, including exponential convergence and efficient implementation. Simulation results with video compressive sensing, hyperspectral imaging, and robust matrix completion data sets demonstrate both the accuracy and efficacy of the algorithm. 1

3 0.70861441 264 nips-2011-Sparse Recovery with Brownian Sensing

Author: Alexandra Carpentier, Odalric-ambrym Maillard, Rémi Munos

Abstract: We consider the problem of recovering the parameter α ∈ R^K of a sparse function f (i.e. the number of non-zero entries of α is small compared to the number K of features) given noisy evaluations of f at a set of well-chosen sampling points. We introduce an additional randomization process, called Brownian sensing, based on the computation of stochastic integrals, which produces a Gaussian sensing matrix, for which good recovery properties are proven, independently of the number of sampling points N, even when the features are arbitrarily non-orthogonal. Under the assumption that f is Hölder continuous with exponent at least 1/2, we provide an estimate α̂ of the parameter such that ‖α̂ − α‖₂ = O(‖η‖₂/√N), where η is the observation noise. The method uses a set of sampling points uniformly distributed along a one-dimensional curve selected according to the features. We report numerical experiments illustrating our method.

4 0.68642014 203 nips-2011-On the accuracy of l1-filtering of signals with block-sparse structure

Author: Fatma K. Karzan, Arkadi S. Nemirovski, Boris T. Polyak, Anatoli Juditsky

Abstract: We discuss new methods for the recovery of signals with block-sparse structure, based on 1 -minimization. Our emphasis is on the efficiently computable error bounds for the recovery routines. We optimize these bounds with respect to the method parameters to construct the estimators with improved statistical properties. We justify the proposed approach with an oracle inequality which links the properties of the recovery algorithms and the best estimation performance. 1

5 0.63649732 297 nips-2011-Universal low-rank matrix recovery from Pauli measurements

Author: Yi-kai Liu

Abstract: We study the problem of reconstructing an unknown matrix M of rank r and dimension d using O(rd poly log d) Pauli measurements. This has applications in quantum state tomography, and is a non-commutative analogue of a well-known problem in compressed sensing: recovering a sparse vector from a few of its Fourier coefficients. We show that almost all sets of O(rd log^6 d) Pauli measurements satisfy the rank-r restricted isometry property (RIP). This implies that M can be recovered from a fixed (“universal”) set of Pauli measurements, using nuclear-norm minimization (e.g., the matrix Lasso), with nearly-optimal bounds on the error. A similar result holds for any class of measurements that use an orthonormal operator basis whose elements have small operator norm. Our proof uses Dudley’s inequality for Gaussian processes, together with bounds on covering numbers obtained via entropy duality.

6 0.5403074 73 nips-2011-Divide-and-Conquer Matrix Factorization

7 0.4752312 259 nips-2011-Sparse Estimation with Structured Dictionaries

8 0.46657988 265 nips-2011-Sparse recovery by thresholded non-negative least squares

9 0.44830006 111 nips-2011-Hashing Algorithms for Large-Scale Learning

10 0.42034709 172 nips-2011-Minimax Localization of Structural Information in Large Noisy Matrices

11 0.38528717 239 nips-2011-Robust Lasso with missing and grossly corrupted observations

12 0.36853999 70 nips-2011-Dimensionality Reduction Using the Sparse Linear Model

13 0.34979543 157 nips-2011-Learning to Search Efficiently in High Dimensions

14 0.32372382 276 nips-2011-Structured sparse coding via lateral inhibition

15 0.31870559 195 nips-2011-On Learning Discrete Graphical Models using Greedy Methods

16 0.30992854 108 nips-2011-Greedy Algorithms for Structurally Constrained High Dimensional Problems

17 0.30339977 143 nips-2011-Learning Anchor Planes for Classification

18 0.28944269 211 nips-2011-Penalty Decomposition Methods for Rank Minimization

19 0.27951977 183 nips-2011-Neural Reconstruction with Approximate Message Passing (NeuRAMP)

20 0.27611884 118 nips-2011-High-dimensional regression with noisy and missing data: Provable guarantees with non-convexity


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.023), (4, 0.028), (20, 0.03), (26, 0.03), (31, 0.057), (33, 0.033), (39, 0.07), (43, 0.088), (45, 0.113), (57, 0.032), (65, 0.021), (74, 0.037), (83, 0.023), (92, 0.285), (99, 0.028)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.70966166 209 nips-2011-Orthogonal Matching Pursuit with Replacement

Author: Prateek Jain, Ambuj Tewari, Inderjit S. Dhillon

Abstract: In this paper, we consider the problem of compressed sensing where the goal is to recover all sparse vectors using a small number of fixed linear measurements. For this problem, we propose a novel partial hard-thresholding operator that leads to a general family of iterative algorithms. While one extreme of the family yields well known hard thresholding algorithms like ITI and HTP [17, 10], the other end of the spectrum leads to a novel algorithm that we call Orthogonal Matching Pursuit with Replacement (OMPR). OMPR, like the classic greedy algorithm OMP, adds exactly one coordinate to the support at each iteration, based on the correlation with the current residual. However, unlike OMP, OMPR also removes one coordinate from the support. This simple change allows us to prove that OMPR has the best known guarantees for sparse recovery in terms of the Restricted Isometry Property (a condition on the measurement matrix). In contrast, OMP is known to have very weak performance guarantees under RIP. Given its simple structure, we are able to extend OMPR using locality sensitive hashing to get OMPR-Hash, the first provably sub-linear (in dimensionality) algorithm for sparse recovery. Our proof techniques are novel and flexible enough to also permit the tightest known analysis of popular iterative algorithms such as CoSaMP and Subspace Pursuit. We provide experimental results on large problems, providing recovery for vectors of size up to a million dimensions. We demonstrate that for large-scale problems our proposed methods are more robust and faster than existing methods.

2 0.70913053 38 nips-2011-Anatomically Constrained Decoding of Finger Flexion from Electrocorticographic Signals

Author: Zuoguan Wang, Gerwin Schalk, Qiang Ji

Abstract: Brain-computer interfaces (BCIs) use brain signals to convey a user’s intent. Some BCI approaches begin by decoding kinematic parameters of movements from brain signals, and then proceed to using these signals, in absence of movements, to allow a user to control an output. Recent results have shown that electrocorticographic (ECoG) recordings from the surface of the brain in humans can give information about kinematic parameters (e.g., hand velocity or finger flexion). The decoding approaches in these demonstrations usually employed classical classification/regression algorithms that derive a linear mapping between brain signals and outputs. However, they typically only incorporate little prior information about the target kinematic parameter. In this paper, we show that different types of anatomical constraints that govern finger flexion can be exploited in this context. Specifically, we incorporate these constraints in the construction, structure, and the probabilistic functions of a switched non-parametric dynamic system (SNDS) model. We then apply the resulting SNDS decoder to infer the flexion of individual fingers from the same ECoG dataset used in a recent study. Our results show that the application of the proposed model, which incorporates anatomical constraints, improves decoding performance compared to the results in the previous work. Thus, the results presented in this paper may ultimately lead to neurally controlled hand prostheses with full fine-grained finger articulation. 1

3 0.65888053 9 nips-2011-A More Powerful Two-Sample Test in High Dimensions using Random Projection

Author: Miles Lopes, Laurent Jacob, Martin J. Wainwright

Abstract: We consider the hypothesis testing problem of detecting a shift between the means of two multivariate normal distributions in the high-dimensional setting, allowing for the data dimension p to exceed the sample size n. Our contribution is a new test statistic for the two-sample test of means that integrates a random projection with the classical Hotelling T 2 statistic. Working within a high-dimensional framework that allows (p, n) → ∞, we first derive an asymptotic power function for our test, and then provide sufficient conditions for it to achieve greater power than other state-of-the-art tests. Using ROC curves generated from simulated data, we demonstrate superior performance against competing tests in the parameter regimes anticipated by our theoretical results. Lastly, we illustrate an advantage of our procedure with comparisons on a high-dimensional gene expression dataset involving the discrimination of different types of cancer. 1

4 0.58417028 149 nips-2011-Learning Sparse Representations of High Dimensional Data on Large Scale Dictionaries

Author: Zhen J. Xiang, Hao Xu, Peter J. Ramadge

Abstract: Learning sparse representations on data adaptive dictionaries is a state-of-the-art method for modeling data. But when the dictionary is large and the data dimension is high, it is a computationally challenging problem. We explore three aspects of the problem. First, we derive new, greatly improved screening tests that quickly identify codewords that are guaranteed to have zero weights. Second, we study the properties of random projections in the context of learning sparse representations. Finally, we develop a hierarchical framework that uses incremental random projections and screening to learn, in small stages, a hierarchically structured dictionary for sparse representations. Empirical results show that our framework can learn informative hierarchical sparse representations more efficiently. 1

5 0.54221827 203 nips-2011-On the accuracy of l1-filtering of signals with block-sparse structure

Author: Fatma K. Karzan, Arkadi S. Nemirovski, Boris T. Polyak, Anatoli Juditsky

Abstract: We discuss new methods for the recovery of signals with block-sparse structure, based on 1 -minimization. Our emphasis is on the efficiently computable error bounds for the recovery routines. We optimize these bounds with respect to the method parameters to construct the estimators with improved statistical properties. We justify the proposed approach with an oracle inequality which links the properties of the recovery algorithms and the best estimation performance. 1

6 0.53150421 297 nips-2011-Universal low-rank matrix recovery from Pauli measurements

7 0.52522421 82 nips-2011-Efficient coding of natural images with a population of noisy Linear-Nonlinear neurons

8 0.52435642 151 nips-2011-Learning a Tree of Metrics with Disjoint Visual Features

9 0.50385696 118 nips-2011-High-dimensional regression with noisy and missing data: Provable guarantees with non-convexity

10 0.5036422 239 nips-2011-Robust Lasso with missing and grossly corrupted observations

11 0.50316411 182 nips-2011-Nearest Neighbor based Greedy Coordinate Descent

12 0.50214648 84 nips-2011-EigenNet: A Bayesian hybrid of generative and conditional models for sparse learning

13 0.50190431 258 nips-2011-Sparse Bayesian Multi-Task Learning

14 0.50134701 294 nips-2011-Unifying Framework for Fast Learning Rate of Non-Sparse Multiple Kernel Learning

15 0.50130415 195 nips-2011-On Learning Discrete Graphical Models using Greedy Methods

16 0.50057614 265 nips-2011-Sparse recovery by thresholded non-negative least squares

17 0.49959618 159 nips-2011-Learning with the weighted trace-norm under arbitrary sampling distributions

18 0.49859846 4 nips-2011-A Convergence Analysis of Log-Linear Training

19 0.49834615 186 nips-2011-Noise Thresholds for Spectral Clustering

20 0.49804291 70 nips-2011-Dimensionality Reduction Using the Sparse Linear Model