nips nips2009 nips2009-185 knowledge-graph by maker-knowledge-mining

185 nips-2009-Orthogonal Matching Pursuit From Noisy Random Measurements: A New Analysis


Source: pdf

Author: Sundeep Rangan, Alyson K. Fletcher

Abstract: A well-known analysis of Tropp and Gilbert shows that orthogonal matching pursuit (OMP) can recover a k-sparse n-dimensional real vector from m = 4k log(n) noise-free linear measurements obtained through a random Gaussian measurement matrix with a probability that approaches one as n → ∞. This work strengthens this result by showing that a lower number of measurements, m = 2k log(n − k), is in fact sufficient for asymptotic recovery. More generally, when the sparsity level satisfies kmin ≤ k ≤ kmax but is unknown, m = 2kmax log(n − kmin ) measurements is sufficient. Furthermore, this number of measurements is also sufficient for detection of the sparsity pattern (support) of the vector with measurement errors provided the signal-to-noise ratio (SNR) scales to infinity. The scaling m = 2k log(n − k) exactly matches the number of measurements required by the more complex lasso method for signal recovery in a similar SNR scaling.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 More generally, when the sparsity level satisfies kmin ≤ k ≤ kmax but is unknown, m = 2kmax log(n − kmin ) measurements is sufficient. [sent-7, score-0.841]

2 Furthermore, this number of measurements is also sufficient for detection of the sparsity pattern (support) of the vector with measurement errors provided the signal-to-noise ratio (SNR) scales to infinity. [sent-8, score-0.519]

3 The scaling m = 2k log(n − k) exactly matches the number of measurements required by the more complex lasso method for signal recovery in a similar SNR scaling. [sent-9, score-0.584]

4 The support of x is the set of locations of its nonzero entries and is sometimes called its sparsity pattern. [sent-11, score-0.227]

5 A common sparse estimation problem is to infer the sparsity pattern of x from linear measurements of the form y = Ax + w, (1) where A ∈ Rm×n is a known measurement matrix, y ∈ Rm represents a vector of measurements and w ∈ Rm is a vector of measurement errors (noise). [sent-12, score-0.831]
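
As a concrete instance of the model (1), the following sketch draws a random Gaussian A, a k-sparse x, and noisy measurements y; the dimensions, noise level, and function name are illustrative choices rather than values from the paper.

```python
import numpy as np

def make_sparse_problem(n=1000, k=20, m=400, sigma=0.05, seed=0):
    """Generate a random instance of y = A x + w with a k-sparse x.

    A has i.i.d. N(0, 1/m) entries, x has k nonzero entries at random
    locations, and w is white Gaussian noise with standard deviation sigma.
    All default values are illustrative.
    """
    rng = np.random.default_rng(seed)
    A = rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n))
    support = rng.choice(n, size=k, replace=False)   # true sparsity pattern I_true
    x = np.zeros(n)
    x[support] = rng.choice([-1.0, 1.0], size=k)     # +/-1 nonzero entries
    w = sigma * rng.normal(size=m)                   # measurement noise
    y = A @ x + w
    return A, x, y, set(support)
```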

6 Sparsity pattern detection and related sparse estimation problems are classical problems in nonlinear signal processing and arise in a variety of applications including wavelet-based image processing [1] and statistical model selection in linear regression [2]. [sent-13, score-0.175]

7 There has also been considerable recent interest in sparsity pattern detection in the context of compressed sensing, which focuses on large random measurement matrices A [3–5]. [sent-14, score-0.398]

8 It is this scenario with random measurements that will be analyzed here. [sent-15, score-0.154]

9 Optimal subset recovery is NP-hard [6] and usually involves searches over all the n-choose-k possible support sets of x. [sent-16, score-0.137]

10 One simple and popular approximate algorithm is orthogonal matching pursuit (OMP) developed in [7–9]. [sent-18, score-0.266]

11 Among other results, Tropp and Gilbert show that when the number of measurements scales as m ≥ (1 + δ)4k log(n) (2) for some δ > 0, A has i.i.d. [sent-23, score-0.154]

12 Gaussian entries, and the measurements are noise-free (w = 0), the OMP method will recover the correct sparse pattern of x with a probability that approaches one as n and k → ∞. [sent-26, score-0.292]

13 Deterministic conditions on the matrix A that guarantee recovery of x by OMP are given in [12]. [sent-27, score-0.157]

14 However, numerical experiments reported in [10] suggest that a smaller number of measurements than (2) may be sufficient for asymptotic recovery with OMP. [sent-28, score-0.349]

15 Specifically, we show that the scaling in measurements m ≥ (1 + δ)2k log(n − k) (3) is also sufficient for asymptotic reliable recovery with OMP provided both n − k and k → ∞. [sent-31, score-0.497]
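
For a sense of scale, the Tropp–Gilbert bound (2) and the new bound (3) can be compared numerically; the problem size and δ below are arbitrary illustrative values.

```python
import math

def tropp_gilbert_m(n, k, delta=0.1):
    # m >= (1 + delta) * 4k log(n), the scaling (2)
    return math.ceil((1 + delta) * 4 * k * math.log(n))

def new_m(n, k, delta=0.1):
    # m >= (1 + delta) * 2k log(n - k), the scaling (3)
    return math.ceil((1 + delta) * 2 * k * math.log(n - k))

n, k = 10_000, 100
print(tropp_gilbert_m(n, k))  # about 4,053 measurements
print(new_m(n, k))            # about 2,025 -- roughly half as many
```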

16 The result goes further by allowing uncertainty in the sparsity level k. [sent-32, score-0.206]

17 While the Tropp–Gilbert analysis requires that the measurements are noise-free, we show that the scaling (3) is also sufficient when there is noise w, provided the signal-to-noise ratio (SNR) goes to infinity. [sent-34, score-0.28]

18 The main significance of the new scaling (3) is that it exactly matches the conditions for sparsity pattern recovery using the well-known lasso method. [sent-35, score-0.648]

19 The lasso method, which will be described in detail in Section 4, is based on a convex relaxation of the optimal detection problem. [sent-36, score-0.201]

20 The best analysis of the sparsity pattern recovery with lasso is due to Wainwright [13, 14]. [sent-37, score-0.529]

21 He showed in [13] that under a similar high SNR assumption, the scaling (3) in number of measurements is both necessary and sufficient for asymptotic reliable sparsity pattern detection. [sent-38, score-0.622]

22 Now, although the lasso method is often more complex than OMP, it is widely believed that lasso has superior performance [10]. [sent-39, score-0.308]

23 Our results show that at least for sparsity pattern recovery with large Gaussian measurement matrices in high SNR, lasso and OMP have identical performance. [sent-40, score-0.609]

24 Hence, the additional complexity of lasso for these problems is not warranted. [sent-41, score-0.154]

25 Of course, neither lasso nor OMP is the best known approximate algorithm, and our intention is not to claim that OMP is optimal in any sense. [sent-42, score-0.154]

26 For example, when there is no noise in the measurements, the lasso minimization (14) can be replaced by x = arg min ‖v‖₁ s.t. y = Av. [sent-43, score-0.181]

27 With i.i.d. Gaussian measurement matrices, this minimization will recover the correct vector with m ≍ 2k log(n/m) (4) when k ≪ n. [sent-49, score-0.13]

28 This scaling is fundamentally better than the scaling (3) achieved by OMP and lasso. [sent-50, score-0.198]

29 The CoSaMP algorithm of Needell and Tropp [16] and subspace pursuit algorithm of Dai and Milenkovic [17] achieve a scaling similar to (4). [sent-52, score-0.239]

30 Rather, we simply intend to show that both OMP and lasso have similar performance in certain scenarios. [sent-56, score-0.154]

31 First, we account for the effect of the noise by separately considering its effect in the “true” subspace and its orthogonal complement. [sent-58, score-0.12]

32 A comparison to lasso is provided in Section 4, and we suggest some future problems in Section 6. [sent-70, score-0.154]

33 The set Itrue will also be called the sparsity pattern. [sent-74, score-0.181]

34 OMP builds up a sequence of estimates of the sparsity pattern Itrue, adding one index at a time. [sent-79, score-0.238]

35 Algorithm 1 (Orthogonal Matching Pursuit) Given a vector y ∈ Rm, a measurement matrix A ∈ Rm×n and a threshold level µ > 0, compute an estimate IOMP of the sparsity pattern of x as follows: [sent-81, score-0.418]

36 Compute P(t), the projection operator onto the orthogonal complement of the span of {ai, i ∈ I(t)}. [sent-84, score-0.163]

37 The final estimate of the sparsity pattern is IOMP = I(t). [sent-95, score-0.261]

38 Note that since P(t) is the projection onto the orthogonal complement of aj for all j ∈ I(t), P(t)aj = 0 for all j ∈ I(t). [sent-96, score-0.199]

39 The algorithm above only provides an estimate, IOMP, of the sparsity pattern Itrue. [sent-98, score-0.259]

40 The estimate x is the projection of the noisy vector y onto the space spanned by the vectors ai with i in the sparsity pattern estimate IOMP. [sent-101, score-0.384]

41 However, this paper only analyzes the sparsity pattern estimate IOMP itself, and not the vector estimate x. [sent-102, score-0.304]
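
Putting the pieces of Algorithm 1 together, here is a minimal numpy sketch of the greedy loop, continuing the synthetic setup sketched earlier. It uses the standard residual-correlation form of OMP and stops when the largest correlation drops below a threshold µ (or after max_iter steps); the normalized statistic ρ*(t) and the specific threshold rule analyzed in the paper differ in detail, so this is an illustration of the mechanism rather than the paper's exact procedure.

```python
import numpy as np

def omp(A, y, mu, max_iter=None):
    """Greedy OMP sketch: pick the column most correlated with the current
    residual, refit by least squares on the selected columns, and stop when
    the best correlation falls below the threshold mu.
    """
    m, n = A.shape
    if max_iter is None:
        max_iter = m
    support = []                         # estimated sparsity pattern
    residual = y.copy()                  # equals P(t) y for the current support
    for _ in range(max_iter):
        corr = A.T @ residual
        j = int(np.argmax(np.abs(corr)))
        if np.abs(corr[j]) <= mu:        # illustrative stand-in for rho*(t) <= mu
            break
        support.append(j)
        A_S = A[:, support]
        x_S, *_ = np.linalg.lstsq(A_S, y, rcond=None)
        residual = y - A_S @ x_S         # project y onto complement of span{a_i, i selected}
    x_hat = np.zeros(n)
    if support:
        x_hat[support] = x_S
    return x_hat, set(support)
```

For instances with m comfortably above the scaling (3) and small noise, this loop typically returns the true support, which is the qualitative behavior Theorem 1 formalizes.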

42 Assumption 1 Consider a sequence of sparse recovery problems, indexed by the vector dimension n. [sent-104, score-0.188]

43 Also assume: (a) The sparsity level, k = k(n) satisfies k(n) ∈ [kmin (n), kmax (n)], (8) for some deterministic sequences kmin (n) and kmax (n) with kmin (n) → ∞ as n → ∞ and kmax (n) < n/2 for all n. [sent-106, score-0.901]

44 (b) The number of measurements m = m(n) is a deterministic sequence satisfying m ≥ (1 + δ)2kmax log(n − kmin ), (9) for some δ > 0. [sent-107, score-0.361]
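
The general sufficient condition (9) is straightforward to evaluate directly; a small helper, with δ = 0.1 purely as an example value:

```python
import math

def sufficient_m(n, k_min, k_max, delta=0.1):
    # Sufficient number of measurements per (9):
    # m >= (1 + delta) * 2 * k_max * log(n - k_min)
    return math.ceil((1 + delta) * 2 * k_max * math.log(n - k_min))
```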

45 Assumption 1(a) provides a range on the sparsity level, k. [sent-115, score-0.181]

46 Assumption 1(b) is our main scaling law on the number of measurements that we will show is sufficient for asymptotic reliable recovery. [sent-117, score-0.436]

47 In the special case when k is known so that kmax = kmin = k, we obtain the simpler scaling law m ≥ (1 + δ)2k log(n − k) (13). [sent-118, score-0.47]

48 We have contrasted this scaling law with the Tropp–Gilbert scaling law (2) in Section 1. [sent-119, score-0.35]

49 We will also compare it to the scaling law for lasso in Section 4. [sent-120, score-0.329]

50 The importance of the smallest component magnitude in the detection of the sparsity pattern was first recognized by Wainwright [13,14,20]. [sent-122, score-0.285]

51 Theorem 1 Under Assumption 1, there exists a sequence of threshold levels µ = µ(n) such that the OMP method in Algorithm 1 will asymptotically detect the correct sparsity pattern, in that Pr[IOMP ≠ Itrue] → 0 as n → ∞. [sent-131, score-0.378]

52 Moreover, the threshold levels µ can be selected simply as a function of kmin, kmax, n, m and δ. [sent-132, score-0.347]

53 Theorem 1 provides our main result and shows that the scaling law (9) is sufficient for asymptotic recovery. [sent-133, score-0.233]

54 4 Comparison to Lasso Performance It is useful to compare the scaling law (13) to the number of measurements required by the widely used lasso method described, for example, in [22]. [sent-134, score-0.483]

55 The lasso method finds an estimate for the vector x in (1) by solving the quadratic program x = arg min_{v∈Rn} ‖y − Av‖₂² + µ‖v‖₁, (14) where µ > 0 is an algorithm parameter that trades off the prediction error with the sparsity of the solution. [sent-135, score-0.399]
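
For a concrete point of comparison, here is a small iterative soft-thresholding (ISTA) sketch for the lasso objective in (14). ISTA is just one convenient solver, not the method used in the papers cited here; the step size and iteration count below are illustrative choices.

```python
import numpy as np

def soft_threshold(z, tau):
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def lasso_ista(A, y, mu, n_iter=500):
    """Minimize ||y - A v||_2^2 + mu * ||v||_1 by proximal gradient (ISTA)."""
    n = A.shape[1]
    v = np.zeros(n)
    # Gradient of ||y - A v||^2 is 2 A^T (A v - y); a step size of
    # 1 / (2 * ||A||_2^2) guarantees convergence.
    eta = 1.0 / (2.0 * np.linalg.norm(A, 2) ** 2)
    for _ in range(n_iter):
        grad = 2.0 * A.T @ (A @ v - y)
        v = soft_threshold(v - eta * grad, eta * mu)
    return v
```

Thresholding the entries of the returned v then gives a lasso-based sparsity pattern estimate, which is the quantity the comparison in this section concerns.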

56 While the optimization (14) is convex, the running time of lasso is significantly longer than OMP unless A has some particular structure [10]. [sent-137, score-0.154]

57 However, it is generally believed that lasso has superior performance. [sent-138, score-0.154]

58 The best analysis of lasso for sparsity pattern recovery for large random matrices is due to Wainwright [13, 14]. [sent-139, score-0.549]

59 With an i.i.d. Gaussian measurement matrix and white Gaussian noise, the condition (13) is necessary for asymptotic reliable detection of the sparsity pattern. [sent-143, score-0.44]

60 In addition, under the condition (10) on the minimum component magnitude, the scaling (13) is also sufficient. [sent-144, score-0.12]

61 We thus conclude that OMP requires an identical scaling in the number of measurements to lasso. [sent-145, score-0.253]

62 Therefore, at least for sparsity pattern recovery from measurements with large random Gaussian measurement matrices and high SNR, there is no additional performance improvement with the more complex lasso method over OMP. [sent-146, score-0.763]

63 5 Threshold Selection and Stopping Conditions In many problems, the sparsity level k is not known a priori and must be detected as part of the estimation process. [sent-147, score-0.206]

64 In OMP, the sparsity level of the estimated vector is precisely the number of iterations conducted before the algorithm terminates. [sent-148, score-0.269]

65 Thus, reliable sparsity level estimation requires a good stopping condition. [sent-149, score-0.321]

66 When the measurements are noise-free and one is concerned only with exact signal recovery, the optimal stopping condition is simple: the algorithm should simply stop whenever there is no more error. [sent-150, score-0.325]

67 The OMP method as described in Algorithm 1 uses a stopping condition based on testing if ρ∗ (t) > µ for some threshold µ. [sent-153, score-0.139]

68 One of the appealing features of Theorem 1 is that it provides a simple sufficient condition under which this threshold mechanism will detect the correct sparsity level. [sent-154, score-0.304]

69 Specifically, Theorem 1 provides a range k ∈ [kmin , kmax ] under which there exists a threshold such that the OMP algorithm will terminate in the correct number of iterations. [sent-155, score-0.233]

70 Of course, in practice, one may deliberately want to stop the OMP algorithm with fewer iterations than the “true” sparsity level. [sent-158, score-0.247]

71 As the OMP method proceeds, the detection becomes less reliable and it is sometimes useful to stop the algorithm whenever there is a high chance of error. [sent-159, score-0.14]

72 However, since our analysis is only concerned with exact sparsity pattern recovery, we do not consider this type of stopping condition. [sent-161, score-0.304]

73 6 Conclusions and Future Work We have provided an improved scaling law on the number of measurements for asymptotic reliable sparsity pattern detection with OMP. [sent-162, score-0.721]

74 This scaling law exactly matches the scaling needed by lasso under similar conditions. [sent-163, score-0.428]

75 Also, our analysis has been restricted to exact sparsity pattern recovery. [sent-167, score-0.238]

76 However, in many problems, especially with noise, it is not necessary to detect every component in the sparsity pattern. [sent-168, score-0.224]

77 It would be useful if partial support recovery results such as those in [24–27] could be obtained for OMP. [sent-169, score-0.137]

78 Finally, our main scaling law (9) is only sufficient. [sent-170, score-0.175]

79 While numerical experiments in [10, 28] suggest that this scaling is also necessary for vectors with equal magnitude, it is possible that OMP can perform better than the scaling law (9) when the component magnitudes have some variation; this is demonstrated numerically in [28]. [sent-171, score-0.325]

80 Compute Ptrue (t), the projection operator onto the orthogonal complement of the span of {ai , i ∈ Itrue (t)}. [sent-181, score-0.163]

81 The final estimate of the sparsity pattern is Itrue (k). [sent-192, score-0.261]

82 This “genie” algorithm is identical to the regular OMP method in Algorithm 1, except that it runs for precisely k iterations as opposed to using a threshold µ for the stopping condition. [sent-193, score-0.161]

83 Also, in the maximization in (16), the genie algorithm searches over only the correct indices j ∈ Itrue . [sent-194, score-0.187]

84 Hence, this genie algorithm can never select an incorrect index j ∉ Itrue. [sent-195, score-0.198]

85 Also, as in the regular OMP algorithm, the genie algorithm will never select the same vector twice for almost all vectors y. [sent-196, score-0.203]

86 Therefore, after k iterations, the genie algorithm will have selected all the k indices in Itrue and terminate with correct sparsity pattern estimate Itrue (k) = Itrue with probability one. [sent-197, score-0.468]

87 A simple induction argument shows that if there are no missed detections or false alarms, the true OMP algorithm will select the same vectors as the “genie” algorithm, and therefore recover the sparsity pattern. [sent-208, score-0.329]

88 Then, define the threshold level µ = µ(n) = 2(1 + ǫ) log(n − kmin) (19). [sent-212, score-0.263]

89 This is done by separately considering the components of w in the span of the vectors aj for j ∈ Itrue and its orthogonal complement. [sent-216, score-0.23]

90 Near-optimal signal recovery from random projections: Universal encoding strategies? [sent-295, score-0.177]

91 Signal recovery from random measurements via orthogonal matching pursuit. [sent-340, score-0.438]

92 Signal recovery from random measurements via orthogonal matching pursuit: The Gaussian case. [sent-349, score-0.438]

93 Sharp thresholds for high-dimensional and noisy recovery of sparsity. [sent-366, score-0.191]

94 Sharp thresholds for high-dimensional and noisy sparsity recovery using ℓ1 -constrained quadratic programming (lasso). [sent-375, score-0.372]

95 CoSaMP: Iterative signal recovery from incomplete and inaccurate samples. [sent-393, score-0.177]

96 Subspace pursuit for compressive sensing: Closing the gap between performance and complexity. [sent-402, score-0.129]

97 Sparse solution of underdetermined linear equations by stagewise orthogonal matching pursuit. [sent-414, score-0.181]

98 Uniform uncertainty principle and signal recovery via regularized orthogonal matching pursuit. [sent-419, score-0.324]

99 Information-theoretic limits on sparsity recovery in the high-dimensional and noisy setting. [sent-427, score-0.376]

100 Sparse support recovery from random measurements with orthogonal matching pursuit. [sent-504, score-0.438]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('itrue', 0.541), ('omp', 0.481), ('snr', 0.198), ('tropp', 0.191), ('kmin', 0.186), ('sparsity', 0.181), ('lasso', 0.154), ('measurements', 0.154), ('iomp', 0.152), ('gilbert', 0.138), ('recovery', 0.137), ('genie', 0.135), ('ptrue', 0.118), ('kmax', 0.109), ('pmd', 0.102), ('scaling', 0.099), ('pursuit', 0.098), ('orthogonal', 0.093), ('pfa', 0.085), ('law', 0.076), ('fletcher', 0.068), ('stopping', 0.066), ('measurement', 0.06), ('aj', 0.059), ('rangan', 0.059), ('asymptotic', 0.058), ('pattern', 0.057), ('brownian', 0.055), ('matching', 0.054), ('threshold', 0.052), ('reliable', 0.049), ('detection', 0.047), ('nonzero', 0.046), ('arxiv', 0.046), ('sensing', 0.043), ('incorrect', 0.042), ('signal', 0.04), ('needell', 0.038), ('lim', 0.038), ('proof', 0.036), ('december', 0.036), ('missed', 0.034), ('donoho', 0.034), ('akcakaya', 0.034), ('stagewise', 0.034), ('suf', 0.033), ('compressed', 0.033), ('noisy', 0.031), ('compressive', 0.031), ('correct', 0.031), ('sparse', 0.031), ('california', 0.031), ('berkeley', 0.03), ('alyson', 0.03), ('march', 0.028), ('components', 0.028), ('noise', 0.027), ('increment', 0.027), ('dai', 0.027), ('true', 0.027), ('limits', 0.027), ('vectors', 0.027), ('january', 0.026), ('rm', 0.026), ('november', 0.025), ('cosamp', 0.025), ('level', 0.025), ('identically', 0.025), ('ax', 0.025), ('complement', 0.025), ('max', 0.025), ('log', 0.024), ('alarm', 0.024), ('necessary', 0.024), ('pr', 0.023), ('stop', 0.023), ('april', 0.023), ('estimate', 0.023), ('thresholds', 0.023), ('span', 0.023), ('wainwright', 0.023), ('iterations', 0.022), ('projection', 0.022), ('condition', 0.021), ('edition', 0.021), ('deterministic', 0.021), ('av', 0.021), ('algorithm', 0.021), ('motion', 0.021), ('assumption', 0.021), ('conditions', 0.02), ('matrices', 0.02), ('vector', 0.02), ('terminate', 0.02), ('june', 0.02), ('false', 0.02), ('gaussian', 0.019), ('detect', 0.019), ('recover', 0.019), ('october', 0.019)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000002 185 nips-2009-Orthogonal Matching Pursuit From Noisy Random Measurements: A New Analysis

Author: Sundeep Rangan, Alyson K. Fletcher

Abstract: A well-known analysis of Tropp and Gilbert shows that orthogonal matching pursuit (OMP) can recover a k-sparse n-dimensional real vector from m = 4k log(n) noise-free linear measurements obtained through a random Gaussian measurement matrix with a probability that approaches one as n → ∞. This work strengthens this result by showing that a lower number of measurements, m = 2k log(n − k), is in fact sufficient for asymptotic recovery. More generally, when the sparsity level satisfies kmin ≤ k ≤ kmax but is unknown, m = 2kmax log(n − kmin ) measurements is sufficient. Furthermore, this number of measurements is also sufficient for detection of the sparsity pattern (support) of the vector with measurement errors provided the signal-to-noise ratio (SNR) scales to infinity. The scaling m = 2k log(n − k) exactly matches the number of measurements required by the more complex lasso method for signal recovery in a similar SNR scaling.

2 0.12635946 157 nips-2009-Multi-Label Prediction via Compressed Sensing

Author: John Langford, Tong Zhang, Daniel J. Hsu, Sham M. Kakade

Abstract: We consider multi-label prediction problems with large output spaces under the assumption of output sparsity – that the target (label) vectors have small support. We develop a general theory for a variant of the popular error correcting output code scheme, using ideas from compressed sensing for exploiting this sparsity. The method can be regarded as a simple reduction from multi-label regression problems to binary regression problems. We show that the number of subproblems need only be logarithmic in the total number of possible labels, making this approach radically more efficient than others. We also state and prove robustness guarantees for this method in the form of regret transform bounds (in general), and also provide a more detailed analysis for the linear prediction setting. 1

3 0.11871736 245 nips-2009-Thresholding Procedures for High Dimensional Variable Selection and Statistical Estimation

Author: Shuheng Zhou

Abstract: Given n noisy samples with p dimensions, where n ≪ p, we show that the multistep thresholding procedure can accurately estimate a sparse vector β ∈ Rp in a linear model, under the restricted eigenvalue conditions (Bickel-Ritov-Tsybakov 09). Thus our conditions for model selection consistency are considerably weaker than what has been achieved in previous works. More importantly, this method allows very significant values of s, which is the number of non-zero elements in the true parameter. For example, it works for cases where the ordinary Lasso would have failed. Finally, we show that if X obeys a uniform uncertainty principle and if the true parameter is sufficiently sparse, the Gauss-Dantzig selector (Candès-Tao 07) achieves the ℓ2 loss within a logarithmic factor of the ideal mean square error one would achieve with an oracle which would supply perfect information about which coordinates are non-zero and which are above the noise level, while selecting a sufficiently sparse model. 1

4 0.098146193 105 nips-2009-Grouped Orthogonal Matching Pursuit for Variable Selection and Prediction

Author: Grzegorz Swirszcz, Naoki Abe, Aurelie C. Lozano

Abstract: We consider the problem of variable group selection for least squares regression, namely, that of selecting groups of variables for best regression performance, leveraging and adhering to a natural grouping structure within the explanatory variables. We show that this problem can be efficiently addressed by using a certain greedy style algorithm. More precisely, we propose the Group Orthogonal Matching Pursuit algorithm (Group-OMP), which extends the standard OMP procedure (also referred to as “forward greedy feature selection algorithm” for least squares regression) to perform stage-wise group variable selection. We prove that under certain conditions Group-OMP can identify the correct (groups of) variables. We also provide an upper bound on the l∞ norm of the difference between the estimated regression coefficients and the true coefficients. Experimental results on simulated and real world datasets indicate that Group-OMP compares favorably to Group Lasso, OMP and Lasso, both in terms of variable selection and prediction accuracy. 1

5 0.096045077 20 nips-2009-A unified framework for high-dimensional analysis of $M$-estimators with decomposable regularizers

Author: Sahand Negahban, Bin Yu, Martin J. Wainwright, Pradeep K. Ravikumar

Abstract: High-dimensional statistical inference deals with models in which the number of parameters p is comparable to or larger than the sample size n. Since it is usually impossible to obtain consistent procedures unless p/n → 0, a line of recent work has studied models with various types of structure (e.g., sparse vectors; block-structured matrices; low-rank matrices; Markov assumptions). In such settings, a general approach to estimation is to solve a regularized convex program (known as a regularized M-estimator) which combines a loss function (measuring how well the model fits the data) with some regularization function that encourages the assumed structure. The goal of this paper is to provide a unified framework for establishing consistency and convergence rates for such regularized M-estimators under high-dimensional scaling. We state one main theorem and show how it can be used to re-derive several existing results, and also to obtain several new results on consistency and convergence rates. Our analysis also identifies two key properties of loss and regularization functions, referred to as restricted strong convexity and decomposability, that ensure the corresponding regularized M-estimators have fast convergence rates. 1

6 0.091283783 36 nips-2009-Asymptotic Analysis of MAP Estimation via the Replica Method and Compressed Sensing

7 0.078896321 222 nips-2009-Sparse Estimation Using General Likelihoods and Non-Factorial Priors

8 0.0766109 167 nips-2009-Non-Parametric Bayesian Dictionary Learning for Sparse Image Representations

9 0.073045604 93 nips-2009-Fast Image Deconvolution using Hyper-Laplacian Priors

10 0.070653655 173 nips-2009-Nonparametric Greedy Algorithms for the Sparse Learning Problem

11 0.065894447 208 nips-2009-Robust Principal Component Analysis: Exact Recovery of Corrupted Low-Rank Matrices via Convex Optimization

12 0.064633779 225 nips-2009-Sparsistent Learning of Varying-coefficient Models with Structural Changes

13 0.063066505 1 nips-2009-$L 1$-Penalized Robust Estimation for a Class of Inverse Problems Arising in Multiview Geometry

14 0.060241524 144 nips-2009-Lower bounds on minimax rates for nonparametric regression with additive sparsity and smoothness

15 0.056891315 137 nips-2009-Learning transport operators for image manifolds

16 0.056725286 55 nips-2009-Compressed Least-Squares Regression

17 0.053590965 79 nips-2009-Efficient Recovery of Jointly Sparse Vectors

18 0.052874669 147 nips-2009-Matrix Completion from Noisy Entries

19 0.052247617 200 nips-2009-Reconstruction of Sparse Circuits Using Multi-neuronal Excitation (RESCUME)

20 0.049484663 224 nips-2009-Sparse and Locally Constant Gaussian Graphical Models


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.13), (1, 0.06), (2, 0.016), (3, 0.074), (4, 0.003), (5, -0.019), (6, 0.102), (7, -0.13), (8, 0.15), (9, 0.067), (10, -0.024), (11, -0.01), (12, -0.004), (13, 0.052), (14, 0.023), (15, 0.082), (16, -0.017), (17, -0.045), (18, -0.043), (19, 0.04), (20, -0.001), (21, 0.061), (22, 0.053), (23, 0.116), (24, 0.051), (25, 0.051), (26, -0.069), (27, -0.039), (28, -0.082), (29, 0.013), (30, -0.143), (31, 0.043), (32, -0.034), (33, 0.043), (34, 0.118), (35, 0.039), (36, 0.022), (37, -0.029), (38, 0.007), (39, -0.089), (40, 0.04), (41, -0.032), (42, 0.085), (43, -0.044), (44, -0.065), (45, 0.001), (46, 0.003), (47, 0.047), (48, -0.001), (49, 0.037)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.94492596 185 nips-2009-Orthogonal Matching Pursuit From Noisy Random Measurements: A New Analysis

Author: Sundeep Rangan, Alyson K. Fletcher

Abstract: A well-known analysis of Tropp and Gilbert shows that orthogonal matching pursuit (OMP) can recover a k-sparse n-dimensional real vector from m = 4k log(n) noise-free linear measurements obtained through a random Gaussian measurement matrix with a probability that approaches one as n → ∞. This work strengthens this result by showing that a lower number of measurements, m = 2k log(n − k), is in fact sufficient for asymptotic recovery. More generally, when the sparsity level satisfies kmin ≤ k ≤ kmax but is unknown, m = 2kmax log(n − kmin ) measurements is sufficient. Furthermore, this number of measurements is also sufficient for detection of the sparsity pattern (support) of the vector with measurement errors provided the signal-to-noise ratio (SNR) scales to infinity. The scaling m = 2k log(n − k) exactly matches the number of measurements required by the more complex lasso method for signal recovery in a similar SNR scaling.

2 0.73740083 36 nips-2009-Asymptotic Analysis of MAP Estimation via the Replica Method and Compressed Sensing

Author: Sundeep Rangan, Vivek Goyal, Alyson K. Fletcher

Abstract: The replica method is a non-rigorous but widely-accepted technique from statistical physics used in the asymptotic analysis of large, random, nonlinear problems. This paper applies the replica method to non-Gaussian maximum a posteriori (MAP) estimation. It is shown that with random linear measurements and Gaussian noise, the asymptotic behavior of the MAP estimate of an n-dimensional vector “decouples” as n scalar MAP estimators. The result is a counterpart to Guo and Verdú's replica analysis of minimum mean-squared error estimation. The replica MAP analysis can be readily applied to many estimators used in compressed sensing, including basis pursuit, lasso, linear estimation with thresholding, and zero norm-regularized estimation. In the case of lasso estimation the scalar estimator reduces to a soft-thresholding operator, and for zero norm-regularized estimation it reduces to a hard-threshold. Among other benefits, the replica method provides a computationally-tractable method for exactly computing various performance metrics including mean-squared error and sparsity pattern recovery probability.

3 0.71267796 245 nips-2009-Thresholding Procedures for High Dimensional Variable Selection and Statistical Estimation

Author: Shuheng Zhou

Abstract: Given n noisy samples with p dimensions, where n ≪ p, we show that the multistep thresholding procedure can accurately estimate a sparse vector β ∈ Rp in a linear model, under the restricted eigenvalue conditions (Bickel-Ritov-Tsybakov 09). Thus our conditions for model selection consistency are considerably weaker than what has been achieved in previous works. More importantly, this method allows very significant values of s, which is the number of non-zero elements in the true parameter. For example, it works for cases where the ordinary Lasso would have failed. Finally, we show that if X obeys a uniform uncertainty principle and if the true parameter is sufficiently sparse, the Gauss-Dantzig selector (Candès-Tao 07) achieves the ℓ2 loss within a logarithmic factor of the ideal mean square error one would achieve with an oracle which would supply perfect information about which coordinates are non-zero and which are above the noise level, while selecting a sufficiently sparse model. 1

4 0.63757128 105 nips-2009-Grouped Orthogonal Matching Pursuit for Variable Selection and Prediction

Author: Grzegorz Swirszcz, Naoki Abe, Aurelie C. Lozano

Abstract: We consider the problem of variable group selection for least squares regression, namely, that of selecting groups of variables for best regression performance, leveraging and adhering to a natural grouping structure within the explanatory variables. We show that this problem can be efficiently addressed by using a certain greedy style algorithm. More precisely, we propose the Group Orthogonal Matching Pursuit algorithm (Group-OMP), which extends the standard OMP procedure (also referred to as “forward greedy feature selection algorithm” for least squares regression) to perform stage-wise group variable selection. We prove that under certain conditions Group-OMP can identify the correct (groups of) variables. We also provide an upper bound on the l∞ norm of the difference between the estimated regression coefficients and the true coefficients. Experimental results on simulated and real world datasets indicate that Group-OMP compares favorably to Group Lasso, OMP and Lasso, both in terms of variable selection and prediction accuracy. 1

5 0.63198084 173 nips-2009-Nonparametric Greedy Algorithms for the Sparse Learning Problem

Author: Han Liu, Xi Chen

Abstract: This paper studies the forward greedy strategy in sparse nonparametric regression. For additive models, we propose an algorithm called additive forward regression; for general multivariate models, we propose an algorithm called generalized forward regression. Both algorithms simultaneously conduct estimation and variable selection in nonparametric settings for the high dimensional sparse learning problem. Our main emphasis is empirical: on both simulated and real data, these two simple greedy methods can clearly outperform several state-of-the-art competitors, including LASSO, a nonparametric version of LASSO called the sparse additive model (SpAM) and a recently proposed adaptive parametric forward-backward algorithm called Foba. We also provide some theoretical justifications of specific versions of the additive forward regression. 1

6 0.62799948 20 nips-2009-A unified framework for high-dimensional analysis of $M$-estimators with decomposable regularizers

7 0.61819267 157 nips-2009-Multi-Label Prediction via Compressed Sensing

8 0.61059749 222 nips-2009-Sparse Estimation Using General Likelihoods and Non-Factorial Priors

9 0.5730108 138 nips-2009-Learning with Compressible Priors

10 0.51447481 79 nips-2009-Efficient Recovery of Jointly Sparse Vectors

11 0.48392001 55 nips-2009-Compressed Least-Squares Regression

12 0.4696269 228 nips-2009-Speeding up Magnetic Resonance Image Acquisition by Bayesian Multi-Slice Adaptive Compressed Sensing

13 0.46755946 1 nips-2009-$L 1$-Penalized Robust Estimation for a Class of Inverse Problems Arising in Multiview Geometry

14 0.46713704 208 nips-2009-Robust Principal Component Analysis: Exact Recovery of Corrupted Low-Rank Matrices via Convex Optimization

15 0.45916128 7 nips-2009-A Data-Driven Approach to Modeling Choice

16 0.42823234 225 nips-2009-Sparsistent Learning of Varying-coefficient Models with Structural Changes

17 0.42046082 93 nips-2009-Fast Image Deconvolution using Hyper-Laplacian Priors

18 0.41763127 224 nips-2009-Sparse and Locally Constant Gaussian Graphical Models

19 0.39316773 167 nips-2009-Non-Parametric Bayesian Dictionary Learning for Sparse Image Representations

20 0.38937694 200 nips-2009-Reconstruction of Sparse Circuits Using Multi-neuronal Excitation (RESCUME)


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(7, 0.013), (24, 0.06), (25, 0.106), (35, 0.07), (36, 0.11), (39, 0.024), (58, 0.103), (61, 0.015), (71, 0.039), (77, 0.284), (81, 0.038), (86, 0.039)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.80771261 185 nips-2009-Orthogonal Matching Pursuit From Noisy Random Measurements: A New Analysis

Author: Sundeep Rangan, Alyson K. Fletcher

Abstract: A well-known analysis of Tropp and Gilbert shows that orthogonal matching pursuit (OMP) can recover a k-sparse n-dimensional real vector from m = 4k log(n) noise-free linear measurements obtained through a random Gaussian measurement matrix with a probability that approaches one as n → ∞. This work strengthens this result by showing that a lower number of measurements, m = 2k log(n − k), is in fact sufficient for asymptotic recovery. More generally, when the sparsity level satisfies kmin ≤ k ≤ kmax but is unknown, m = 2kmax log(n − kmin ) measurements is sufficient. Furthermore, this number of measurements is also sufficient for detection of the sparsity pattern (support) of the vector with measurement errors provided the signal-to-noise ratio (SNR) scales to infinity. The scaling m = 2k log(n − k) exactly matches the number of measurements required by the more complex lasso method for signal recovery in a similar SNR scaling.

2 0.8057282 106 nips-2009-Heavy-Tailed Symmetric Stochastic Neighbor Embedding

Author: Zhirong Yang, Irwin King, Zenglin Xu, Erkki Oja

Abstract: Stochastic Neighbor Embedding (SNE) has been shown to be quite promising for data visualization. Currently, the most popular implementation, t-SNE, is restricted to a particular Student t-distribution as its embedding distribution. Moreover, it uses a gradient descent algorithm that may require users to tune parameters such as the learning step size, momentum, etc., in finding its optimum. In this paper, we propose the Heavy-tailed Symmetric Stochastic Neighbor Embedding (HSSNE) method, which is a generalization of the t-SNE to accommodate various heavy-tailed embedding similarity functions. With this generalization, we are presented with two difficulties. The first is how to select the best embedding similarity among all heavy-tailed functions and the second is how to optimize the objective function once the heavy-tailed function has been selected. Our contributions then are: (1) we point out that various heavy-tailed embedding similarities can be characterized by their negative score functions. Based on this finding, we present a parameterized subset of similarity functions for choosing the best tail-heaviness for HSSNE; (2) we present a fixed-point optimization algorithm that can be applied to all heavy-tailed functions and does not require the user to set any parameters; and (3) we present two empirical studies, one for unsupervised visualization showing that our optimization algorithm runs as fast and as good as the best known t-SNE implementation and the other for semi-supervised visualization showing quantitative superiority using the homogeneity measure as well as qualitative advantage in cluster separation over t-SNE.

3 0.74949878 237 nips-2009-Subject independent EEG-based BCI decoding

Author: Siamac Fazli, Cristian Grozea, Marton Danoczy, Benjamin Blankertz, Florin Popescu, Klaus-Robert Müller

Abstract: In the quest to make Brain Computer Interfacing (BCI) more usable, dry electrodes have emerged that get rid of the initial 30 minutes required for placing an electrode cap. Another time-consuming step is the required individualized adaptation to the BCI user, which involves another 30 minutes of calibration for assessing a subject’s brain signature. In this paper we aim to also remove this calibration procedure from BCI setup time by means of machine learning. In particular, we harvest a large database of EEG BCI motor imagination recordings (83 subjects) for constructing a library of subject-specific spatio-temporal filters and derive a subject independent BCI classifier. Our offline results indicate that BCI-naïve users could start real-time BCI use with no prior calibration at only a very moderate performance loss.

4 0.74007618 136 nips-2009-Learning to Rank by Optimizing NDCG Measure

Author: Hamed Valizadegan, Rong Jin, Ruofei Zhang, Jianchang Mao

Abstract: Learning to rank is a relatively new field of study, aiming to learn a ranking function from a set of training data with relevancy labels. The ranking algorithms are often evaluated using information retrieval measures, such as Normalized Discounted Cumulative Gain (NDCG) [1] and Mean Average Precision (MAP) [2]. Until recently, most learning to rank algorithms were not using a loss function related to the above mentioned evaluation measures. The main difficulty in direct optimization of these measures is that they depend on the ranks of documents, not the numerical values output by the ranking function. We propose a probabilistic framework that addresses this challenge by optimizing the expectation of NDCG over all the possible permutations of documents. A relaxation strategy is used to approximate the average of NDCG over the space of permutation, and a bound optimization approach is proposed to make the computation efficient. Extensive experiments show that the proposed algorithm outperforms state-of-the-art ranking algorithms on several benchmark data sets. 1

5 0.61070269 199 nips-2009-Ranking Measures and Loss Functions in Learning to Rank

Author: Wei Chen, Tie-yan Liu, Yanyan Lan, Zhi-ming Ma, Hang Li

Abstract: Learning to rank has become an important research topic in machine learning. While most learning-to-rank methods learn the ranking functions by minimizing loss functions, it is the ranking measures (such as NDCG and MAP) that are used to evaluate the performance of the learned ranking functions. In this work, we reveal the relationship between ranking measures and loss functions in learning-to-rank methods, such as Ranking SVM, RankBoost, RankNet, and ListMLE. We show that the loss functions of these methods are upper bounds of the measure-based ranking errors. As a result, the minimization of these loss functions will lead to the maximization of the ranking measures. The key to obtaining this result is to model ranking as a sequence of classification tasks, and define a so-called essential loss for ranking as the weighted sum of the classification errors of individual tasks in the sequence. We have proved that the essential loss is both an upper bound of the measure-based ranking errors, and a lower bound of the loss functions in the aforementioned methods. Our proof technique also suggests a way to modify existing loss functions to make them tighter bounds of the measure-based ranking errors. Experimental results on benchmark datasets show that the modifications can lead to better ranking performances, demonstrating the correctness of our theoretical analysis. 1

6 0.59224993 105 nips-2009-Grouped Orthogonal Matching Pursuit for Variable Selection and Prediction

7 0.58885843 230 nips-2009-Statistical Consistency of Top-k Ranking

8 0.58512419 1 nips-2009-$L 1$-Penalized Robust Estimation for a Class of Inverse Problems Arising in Multiview Geometry

9 0.58509338 36 nips-2009-Asymptotic Analysis of MAP Estimation via the Replica Method and Compressed Sensing

10 0.58172327 254 nips-2009-Variational Gaussian-process factor analysis for modeling spatio-temporal data

11 0.57902563 225 nips-2009-Sparsistent Learning of Varying-coefficient Models with Structural Changes

12 0.57742327 97 nips-2009-Free energy score space

13 0.5753426 174 nips-2009-Nonparametric Latent Feature Models for Link Prediction

14 0.57402682 20 nips-2009-A unified framework for high-dimensional analysis of $M$-estimators with decomposable regularizers

15 0.57145441 215 nips-2009-Sensitivity analysis in HMMs with application to likelihood maximization

16 0.57105041 173 nips-2009-Nonparametric Greedy Algorithms for the Sparse Learning Problem

17 0.5698778 35 nips-2009-Approximating MAP by Compensating for Structural Relaxations

18 0.56972885 214 nips-2009-Semi-supervised Regression using Hessian energy with an application to semi-supervised dimensionality reduction

19 0.56851196 61 nips-2009-Convex Relaxation of Mixture Regression with Efficient Algorithms

20 0.56806785 162 nips-2009-Neural Implementation of Hierarchical Bayesian Inference by Importance Sampling