nips nips2009 nips2009-36 knowledge-graph by maker-knowledge-mining

36 nips-2009-Asymptotic Analysis of MAP Estimation via the Replica Method and Compressed Sensing


Source: pdf

Author: Sundeep Rangan, Vivek Goyal, Alyson K. Fletcher

Abstract: The replica method is a non-rigorous but widely-accepted technique from statistical physics used in the asymptotic analysis of large, random, nonlinear problems. This paper applies the replica method to non-Gaussian maximum a posteriori (MAP) estimation. It is shown that with random linear measurements and Gaussian noise, the asymptotic behavior of the MAP estimate of an n-dimensional vector “decouples” as n scalar MAP estimators. The result is a counterpart to Guo and Verdú’s replica analysis of minimum mean-squared error estimation. The replica MAP analysis can be readily applied to many estimators used in compressed sensing, including basis pursuit, lasso, linear estimation with thresholding, and zero norm-regularized estimation. In the case of lasso estimation the scalar estimator reduces to a soft-thresholding operator, and for zero norm-regularized estimation it reduces to a hard threshold. Among other benefits, the replica method provides a computationally-tractable method for exactly computing various performance metrics including mean-squared error and sparsity pattern recovery probability.

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 edu Abstract The replica method is a non-rigorous but widely-accepted technique from statistical physics used in the asymptotic analysis of large, random, nonlinear problems. [sent-9, score-0.872]

2 This paper applies the replica method to non-Gaussian maximum a posteriori (MAP) estimation. [sent-10, score-0.762]

3 It is shown that with random linear measurements and Gaussian noise, the asymptotic behavior of the MAP estimate of an n-dimensional vector “decouples” as n scalar MAP estimators. [sent-11, score-0.365]

4 The result is a counterpart to Guo and Verdú’s replica analysis of minimum mean-squared error estimation. [sent-12, score-0.746]

5 The replica MAP analysis can be readily applied to many estimators used in compressed sensing, including basis pursuit, lasso, linear estimation with thresholding, and zero norm-regularized estimation. [sent-13, score-1.025]

6 In the case of lasso estimation the scalar estimator reduces to a soft-thresholding operator, and for zero norm-regularized estimation it reduces to a hard threshold. [sent-14, score-0.527]

7 Among other benefits, the replica method provides a computationally-tractable method for exactly computing various performance metrics including mean-squared error and sparsity pattern recovery probability. [sent-15, score-0.807]

8 1 Introduction Estimating a vector x ∈ Rn from measurements of the form y = Φx + w, (1) where Φ ∈ Rm×n represents a known measurement matrix and w ∈ Rm represents measurement errors or noise, is a generic problem that arises in a range of circumstances. [sent-16, score-0.143]

9 One of the most basic estimators for x is the maximum a posteriori (MAP) estimate x̂map(y) = arg max_{x∈Rn} px|y(x|y), (2) which is defined assuming some prior on x. [sent-17, score-0.371]

10 For most priors, the MAP estimate is nonlinear and its behavior is not easily characterizable. [sent-18, score-0.076]

11 Even if the priors for x and w are separable, the analysis of the MAP estimate may be difficult since the matrix Φ couples the n unknown components of x with the m measurements in the vector y. [sent-19, score-0.14]

12 The primary contribution of this paper—an abridged version of [1]—is to show that with certain large random Φ and Gaussian w, there is an asymptotic decoupling of (1) into n scalar MAP estimation problems. [sent-20, score-0.305]

13 Each equivalent scalar problem has an appropriate scalar prior and Gaussian noise with an effective noise level. [sent-21, score-0.555]

14 The analysis yields the asymptotic joint distribution of each component xj of x and its corresponding estimate x̂j in the MAP estimate vector x̂map(y). [sent-22, score-0.554]

15 Our analysis is based on a powerful but non-rigorous technique from statistical physics known as the replica method. [sent-26, score-0.777]

16 The replica method was originally developed by Edwards and Anderson [2] to study the statistical mechanics of spin glasses. [sent-27, score-0.797]

17 The replica method was first applied to the study of nonlinear MAP estimation problems by Tanaka [4] and Müller [5]. [sent-29, score-0.81]

18 These papers studied the behavior of the MAP estimator of a vector x with i.i.d. [sent-30, score-0.144]

19 binary components observed through linear measurements of the form (1) with a large random Φ and Gaussian w. [sent-33, score-0.075]

20 Guo and Verdú’s result was also able to incorporate a large class of minimum postulated MSE estimators, where the estimator may assume a prior that is different from the actual prior. [sent-35, score-0.27]

21 The non-rigorous aspect of the replica method involves a set of assumptions that include a self-averaging property, the validity of a “replica trick,” and the ability to exchange certain limits. [sent-38, score-0.777]

22 Also, some of the predictions of the replica method have been validated rigorously by other means [8]. [sent-40, score-0.746]

23 As an application of our main result, we will develop a few analyses of estimation problems that arise in compressed sensing [9–11]. [sent-44, score-0.185]

24 In compressed sensing, one estimates a sparse vector x from random linear measurements. [sent-45, score-0.132]

25 Generically, optimal estimation of x with a sparse prior is NP-hard [12]. [sent-46, score-0.098]

26 Thus, most attention has focused on greedy heuristics such as matching pursuit and convex relaxations such as basis pursuit [13] or lasso [14]. [sent-47, score-0.218]

27 Recent compressed sensing research has provided scaling laws on numbers of measurements that guarantee good performance of these methods [15–17]. [sent-49, score-0.202]

28 There are, of course, notable exceptions including [18] and [19] which provide matching necessary and sufficient conditions for recovery of strictly sparse vectors with basis pursuit and lasso. [sent-51, score-0.142]

29 However, even these results only consider exact recovery and are limited to measurements that are noise-free or measurements with a signal-to-noise ratio (SNR) that scales to infinity. [sent-52, score-0.137]

30 Many common sparse estimators can be seen as MAP estimators with certain postulated priors. [sent-53, score-0.403]

31 Most importantly, lasso and basis pursuit are MAP estimators assuming a Laplacian prior. [sent-54, score-0.271]

32 Other commonly-used sparse estimation algorithms, including linear estimation with and without thresholding and zero norm-regularized estimators, can also be seen as MAP-based estimators. [sent-55, score-0.184]

33 For these algorithms, the replica method provides—under the assumption of the replica hypotheses—not just bounds, but the exact asymptotic behavior. [sent-56, score-1.573]

34 2 Estimation Problem and Assumptions Consider the estimation of a random vector x ∈ Rn from linear measurements of the form y = Φx + w = AS^{1/2}x + w, (3) where y ∈ Rm is a vector of observations, Φ = AS^{1/2}, A ∈ Rm×n is a measurement matrix, and S is a diagonal matrix of positive scale factors, [sent-59, score-0.13]

35 S = diag(s1, . . . , sn), sj > 0, (4) and w ∈ Rm is zero-mean, white Gaussian noise. [sent-62, score-0.09]
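The measurement model (3)-(4) is easy to simulate. The sketch below is illustrative only: the Gauss-Bernoulli prior for x, the unit scale factors sj, the i.i.d. Gaussian measurement matrix A, and all parameter values are assumptions made for the example, not choices fixed by the paper.

```python
import numpy as np

def generate_measurements(n=500, beta=2.0, sparsity=0.1, sigma0=0.1, seed=0):
    """Draw one instance of y = A S^{1/2} x + w under assumed distributions."""
    rng = np.random.default_rng(seed)
    m = int(round(n / beta))                              # so that n/m is roughly beta
    x = rng.normal(size=n) * (rng.random(n) < sparsity)   # assumed sparse (Gauss-Bernoulli) signal
    s = np.ones(n)                                        # assumed scale factors s_j = 1
    A = rng.normal(scale=1.0 / np.sqrt(m), size=(m, n))   # random measurement matrix
    w = rng.normal(scale=sigma0, size=m)                  # white Gaussian noise
    y = A @ (np.sqrt(s) * x) + w
    return y, A, s, x

y, A, s, x = generate_measurements()
```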

36 The components xj of x are modeled as zero mean and i.i.d. [sent-65, score-0.141]

37 The per-component variance of the Gaussian noise is E|wj|^2 = σ0^2. [sent-69, score-0.09]

38 We use the subscript “0” on the prior and noise level to differentiate these quantities from certain “postulated” values to be defined later. [sent-70, score-0.145]

39 Variations in the power of x that are not known to the estimator should be captured in the distribution of x. [sent-78, score-0.098]

40 We summarize the situation and make additional assumptions to specify the problem precisely as follows: (a) The number of measurements m = m(n) is a deterministic quantity that varies with n and satisfies lim_{n→∞} n/m(n) = β for some β ≥ 0. [sent-79, score-0.14]

41 (c) The noise w is Gaussian with w ∼ N(0, σ0^2 Im). [sent-85, score-0.09]

42 The scale factor matrix S, measurement matrix A, vector x and noise w are independent. [sent-94, score-0.147]

43 Suppose one is given a “postulated” prior distribution ppost and a postulated noise level σpost^2 that may be different from the true values p0 and σ0^2. [sent-96, score-0.426]

44 The Replica MMSE Claim describes the asymptotic behavior of the postulated MMSE estimator via an equivalent scalar estimator. [sent-98, score-0.52]

45 The distribution px|z(x|z ; q, µ) is the conditional distribution of the scalar random variable x ∼ q(x) from an observation of the form z = x + √µ·v, (8) where v ∼ N(0, 1). [sent-101, score-0.159]

46 Using this distribution, we can define the scalar conditional MMSE estimate, x̂mmse_scalar(z ; q, µ) = ∫ x·px|z(x|z ; µ) dx. [sent-102, score-0.207]
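For a generic scalar prior density q, this conditional MMSE estimate is simply the posterior mean of x given the observation z = x + √µ·v. A rough numerical sketch follows (grid-based posterior mean; the grid range and the Gaussian test prior are assumptions for illustration):

```python
import numpy as np

def scalar_mmse(z, q, mu, grid=np.linspace(-10.0, 10.0, 4001)):
    """Posterior mean E[x | z] for z = x + sqrt(mu)*v, v ~ N(0,1), x ~ q."""
    post = q(grid) * np.exp(-(z - grid) ** 2 / (2.0 * mu))  # unnormalized posterior on the grid
    return float(np.sum(grid * post) / np.sum(post))        # normalize and take the mean

# Check against the Gaussian prior, for which E[x | z] = z / (1 + mu).
q_gauss = lambda t: np.exp(-t ** 2 / 2.0)
print(scalar_mmse(1.0, q_gauss, mu=0.5), 1.0 / 1.5)
```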

47 Let x̂mpmse(y) be the MPMSE estimator based on a postulated prior ppost and postulated noise level σpost^2. [sent-106, score-0.048]

48 (b) The effective noise levels satisfy the equations σeff^2 = σ0^2 + βE[s·mse(ppost, p0, µp, µ, z)], (12a) σp−eff^2 = σpost^2 + βE[s·mse(ppost, ppost, µp, µp, z)], (12b) where the expectations are taken over s ∼ pS(s) and z generated by (11). [sent-113, score-0.408]

49 The Replica MMSE Claim asserts that the asymptotic behavior of the joint estimation of the n-dimensional vector x can be described by n equivalent scalar estimators. [sent-114, score-0.359]

50 In the scalar estimation problem, a component x ∼ p0(x) is corrupted by additive Gaussian noise yielding a noisy measurement z. [sent-115, score-0.352]

51 The additive noise variance is µ = σeff^2/s, which is the effective noise divided by the scale factor s. [sent-116, score-0.218]

52 The estimate of that component is then described by the (generally nonlinear) scalar estimator x̂(z ; ppost, µp). [sent-117, score-0.462]

53 The effective noise levels σeff^2 and σp−eff^2 are described by the solutions to the fixed-point equations (12). [sent-118, score-0.265]

54 Let X ⊆ R be some (measurable) set and consider an estimator of the form x̂map(y) = arg min_{x∈X^n} (1/(2γ))||y − AS^{1/2}x||_2^2 + Σ_{j=1}^n f(xj), (13) where γ > 0 is an algorithm parameter and f : X → R is some scalar-valued, non-negative cost function. [sent-122, score-0.315]
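The estimator (13) minimizes a penalized least-squares objective with a separable penalty. A small sketch of just the objective evaluation follows; the default penalty f(x) = |x| (the lasso choice) is only an example and the function name is ours:

```python
import numpy as np

def map_objective(x, y, A, s, gamma, f=np.abs):
    """Objective in (13): ||y - A S^{1/2} x||^2 / (2*gamma) + sum_j f(x_j)."""
    r = y - A @ (np.sqrt(s) * x)
    return float(r @ r / (2.0 * gamma) + np.sum(f(x)))
```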

55 The estimator (13) can be interpreted as a MAP estimator. [sent-124, score-0.098]

56 Specifically, for any u > 0, it can be verified that x̂map(y) is the MAP estimate x̂map(y) = arg max_{x∈X^n} px|y(x | y ; pu, σu^2), where pu(x) and σu^2 are the prior and noise level pu(x) = [∫_{X^n} exp(−u·f(x)) dx]^{−1} exp(−u·f(x)), σu^2 = γ/u, (14) where f(x) = Σj f(xj). [sent-125, score-0.851]

57 To analyze this MAP estimator, we consider a sequence of MMSE estimators x̂u(y) = E[x | y ; pu, σu^2]. (15) [sent-126, score-0.097]

58 The proof of the Replica MAP Claim below (see [1]) uses a standard large deviations argument to show that lim_{u→∞} x̂u(y) = x̂map(y) for all y. [sent-127, score-0.273]

59 Under the assumption that the behaviors of the MMSE estimators are described by the Replica MMSE Claim, we can then extrapolate the behavior of the MAP estimator. [sent-128, score-0.132]

60 To state the claim, define the scalar MAP estimator x̂map_scalar(z ; λ) = arg min_{x∈X} F(x, z, λ), F(x, z, λ) = (1/(2λ))|z − x|^2 + f(x). (16) [sent-129, score-0.457]
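A brute-force sketch of this scalar MAP estimator, minimizing F(x, z, λ) = |z − x|^2/(2λ) + f(x) over a grid; the grid, its range, and the example penalty f(x) = |x| are assumptions made for illustration:

```python
import numpy as np

def scalar_map(z, lam, f=np.abs, grid=np.linspace(-10.0, 10.0, 40001)):
    """Grid minimizer of F(x, z, lam) = (z - x)^2 / (2*lam) + f(x)."""
    F = (z - grid) ** 2 / (2.0 * lam) + f(grid)
    return float(grid[np.argmin(F)])

# With f(x) = |x| this should agree with soft thresholding: z - lam for z > lam.
print(scalar_map(2.0, 0.5), 2.0 - 0.5)
```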

61 We also assume that the limit σ^2(z, λ) = lim_{x→x̂} |x − x̂|^2 / (2(F(x, z, λ) − F(x̂, z, λ))) (17) exists, where x̂ = x̂map_scalar(z; λ). [sent-131, score-0.244]

62 We make the following additional assumptions: Assumption 1 Consider the MAP estimator (13) applied to the estimation problem in Section 2. [sent-132, score-0.148]

63 Assume: (a) For all u > 0 sufficiently large, assume the postulated prior pu and noise level σu^2 satisfy the Replica MMSE Claim. [sent-133, score-0.373]

64 Also, assume that for the corresponding effective noise levels, σeff^2(u) and σp−eff^2(u), the following limits exist: σeff,map^2 = lim_{u→∞} σeff^2(u), γp = lim_{u→∞} u·σp−eff^2(u). [sent-134, score-0.285]

65 (b) Suppose for each n, x̂j^u(n) is the MMSE estimate of the component xj for some index j ∈ {1, . [sent-135, score-0.2]

66 , n} based on the postulated prior pu and noise level σu^2. [sent-138, score-0.09]

67 Then, assume that the following limits can be interchanged: lim_{u→∞} lim_{n→∞} x̂j^u(n) = lim_{n→∞} lim_{u→∞} x̂j^u(n), where the limits are in distribution. [sent-139, score-0.372]

68 Let x̂map(y) be the MAP estimator (13) defined for some f(x) and γ > 0 satisfying Assumption 1. [sent-146, score-0.279]

69 Then: (a) As n → ∞, the random vectors (xj, sj, x̂j^map) converge in distribution to the random vector (x, s, x̂), where x, s, and v are independent with x ∼ p0(x), s ∼ pS(s), v ∼ N(0, 1), and x̂ = x̂map_scalar(z, λp), z = x + √µ·v, (18) where µ = σeff,map^2/s and λp = γp/s. [sent-151, score-0.429]

70 (b) The limiting effective noise levels σeff,map^2 and γp satisfy the equations σeff,map^2 = σ0^2 + βE[s|x − x̂|^2], (19a) γp = γ + βE[s·σ^2(z, λp)], (19b) where the expectations are taken over x ∼ p0(x), s ∼ pS(s), and v ∼ N(0, 1), with x̂ and z defined in (18). [sent-152, score-0.265]

71 Analogously to the Replica MMSE Claim, the Replica MAP Claim asserts that the asymptotic behavior of the MAP estimate of any single component of x is described by a simple equivalent scalar estimator. [sent-153, score-0.354]

72 In the equivalent scalar model, the component of the true vector x is corrupted by Gaussian noise and the estimate of that component is given by a scalar MAP estimate of the component from the noise-corrupted version. [sent-154, score-0.602]

73 In this section, we first consider choices of f that yield MAP estimators relevant to compressed sensing. [sent-157, score-0.189]

74 We then additionally impose a sparse prior for x for numerical evaluations of asymptotic performance. [sent-158, score-0.146]

75 We first consider the lasso or basis pursuit estimate [13, 14] given by x̂lasso(y) = arg min_{x∈Rn} (1/(2γ))||y − AS^{1/2}x||_2^2 + ||x||_1, (20) where γ > 0 is an algorithm parameter. [sent-160, score-0.22]

76 This estimator is identical to the MAP estimator (13) with the cost function f (x) = |x|. [sent-161, score-0.213]

77 With this cost function, the scalar MAP estimator in (16) is given by x̂map_scalar(z ; λ) = Tλ^soft(z), (21) where Tλ^soft(z) is the soft-thresholding operator, Tλ^soft(z) = z − λ, if z > λ; 0, if |z| ≤ λ; z + λ, if z < −λ. [sent-162, score-0.668]
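A one-line sketch of the soft-thresholding operator above (the function name is ours):

```python
import numpy as np

def soft_threshold(z, lam):
    """T^soft_lam(z): shrink toward zero by lam, clipping to zero on [-lam, lam]."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

print(soft_threshold(np.array([2.0, 0.3, -1.5]), 0.5))   # -> [ 1.5  0.  -1. ]
```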

78 Hence, the asymptotic behavior of lasso has a remarkably simple description: the asymptotic distribution of the lasso estimate x̂j of the component xj is identical to xj being corrupted by Gaussian noise and then soft-thresholded to yield the estimate x̂j. [sent-164, score-0.96]

79 To calculate the effective noise levels, one can perform a simple calculation to show that σ^2(z, λ) in (17) is given by σ^2(z, λ) = λ, if |z| > λ; 0, if |z| ≤ λ. (24) [sent-165, score-0.128]

80 Substituting (21) and (25) into (19), we obtain the fixed-point equations σeff,map^2 = σ0^2 + βE[s|x − Tλp^soft(z)|^2], (26a) γp = γ + βγp·Pr(|z| > γp/s), (26b) where the expectations are taken with respect to x ∼ p0(x), s ∼ pS(s), and z in (23). [sent-167, score-0.109]
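The fixed-point equations (26) can be evaluated numerically by iterating them with the expectations replaced by Monte Carlo averages. The sketch below is one assumed way to do this (damped iteration, a Gauss-Bernoulli prior, unit scale factors); it is not a procedure prescribed by the paper, and convergence is not guaranteed for every parameter choice.

```python
import numpy as np

def lasso_effective_noise(gamma, beta, sigma0, sample_prior,
                          n_mc=200_000, n_iter=200, damp=0.5, seed=0):
    """Approximate a fixed point of (26a)-(26b) by damped Monte Carlo iteration."""
    rng = np.random.default_rng(seed)
    x = sample_prior(rng, n_mc)            # samples x ~ p0
    v = rng.normal(size=n_mc)              # v ~ N(0, 1)
    s = np.ones(n_mc)                      # assumed scale factors s = 1
    sig2, gp = sigma0 ** 2, gamma          # initial guesses for sigma_eff,map^2 and gamma_p
    for _ in range(n_iter):
        lam_p = gp / s
        z = x + np.sqrt(sig2 / s) * v      # equivalent scalar channel
        xhat = np.sign(z) * np.maximum(np.abs(z) - lam_p, 0.0)        # soft threshold (21)
        new_sig2 = sigma0 ** 2 + beta * np.mean(s * (x - xhat) ** 2)  # (26a)
        new_gp = gamma + beta * gp * np.mean(np.abs(z) > lam_p)       # (26b)
        sig2 = damp * new_sig2 + (1.0 - damp) * sig2
        gp = damp * new_gp + (1.0 - damp) * gp
    return sig2, gp

# Example with an assumed Gauss-Bernoulli prior of sparsity 0.1.
bg = lambda rng, n: rng.normal(size=n) * (rng.random(n) < 0.1)
print(lasso_effective_noise(gamma=0.1, beta=2.0, sigma0=0.1, sample_prior=bg))
```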

81 Lasso can be regarded as a convex relaxation of zero norm-regularized estimation x̂zero(y) = arg min_{x∈Rn} (1/(2γ))||y − AS^{1/2}x||_2^2 + ||x||_0, (27) where ||x||_0 is the number of nonzero components of x. [sent-170, score-0.168]

82 For certain strictly sparse priors, zero norm-regularized estimation may provide better performance than lasso. [sent-171, score-0.116]

83 While computing the zero norm-regularized estimate is generally very difficult, we can use the replica analysis to provide a simple characterization of its performance. [sent-172, score-0.801]

84 The zero norm-regularized estimator is identical to the MAP estimator (13) with the cost function f(x) = 0, if x = 0; 1, if x ≠ 0. (28) [sent-174, score-0.235]

85 With f(x) given by (28), the scalar MAP estimator in (16) is given by x̂map_scalar(z ; λ) = Tt^hard(z), t = √(2λ), (29) where Tt^hard is the hard-thresholding operator, Tt^hard(z) = z, if |z| > t; 0, if |z| ≤ t. [sent-179, score-0.471]
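A matching sketch of this hard-thresholding rule, with threshold t = √(2λ) as in (29); the function name is ours:

```python
import numpy as np

def hard_threshold(z, lam):
    """T^hard_t(z) with t = sqrt(2*lam): keep z if |z| > t, otherwise zero."""
    t = np.sqrt(2.0 * lam)
    return np.where(np.abs(z) > t, z, 0.0)

print(hard_threshold(np.array([2.0, 0.3, -1.5]), 0.5))   # t = 1 -> [ 2.  0. -1.5]
```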

86 Thus, the zero norm-regularized estimation of a vector x is equivalent to n scalar components corrupted by some effective noise level σeff,map^2 and hard-thresholded based on an effective noise level γp. [sent-181, score-0.575]

87 The fixed-point equations for the effective noise levels σeff,map^2 and γp can be computed similarly to the case of lasso. [sent-182, score-0.217]

88 Plotted is the median normalized SE for various sparse recovery algorithms: linear MMSE estimation, lasso, zero norm-regularized estimation, and optimal MMSE estimation. [sent-192, score-0.176]

89 Solid lines show the asymptotic predicted MSE from the Replica MAP Claim. [sent-193, score-0.081]

90 For the linear and lasso estimators, the circles and triangles show the actual median SE over 1000 Monte Carlo simulations. [sent-194, score-0.132]

91 For each problem size, we simulated the lasso and linear MMSE estimators over 1000 independent instances with noise levels chosen such that the SNR with perfect side information is 10 dB. [sent-207, score-0.339]

92 The simulated performance is matched very closely by the asymptotic values predicted by the replica analysis. [sent-210, score-0.827]

93 (Analysis of the linear MMSE estimator using the Replica MAP Claim is detailed in [1]; the Replica MMSE Claim is also applicable to this estimator.) [sent-211, score-0.098]

94 In addition, the replica analysis can be applied to zero norm-regularized and optimal MMSE estimators that are computationally infeasible for large problems. [sent-212, score-0.871]

95 Their performance is also shown in Fig. 1, illustrating the potential of the replica method to quantify the precise performance losses of practical algorithms. [sent-214, score-0.746]

96 Additional numerical simulations in [1] illustrate convergence to the replica MAP limit, applicability to discrete distributions for x, effects of power variations in the components, and accurate prediction of the probability of sparsity pattern recovery. [sent-215, score-0.797]

97 6 Conclusions We have shown that the behavior of vector MAP estimators with large random measurement matrices and Gaussian noise asymptotically matches that of a set of decoupled scalar estimation problems. [sent-216, score-0.488]

98 We believe that this equivalence to a simple scalar model will open up numerous doors for analysis, particularly in problems of interest in compressed sensing. [sent-217, score-0.245]

99 Asymptotic analysis of MAP estimation via the replica method and applications to compressed sensing. [sent-226, score-0.882]

100 Statistical physics of spin glasses and information processing: An introduction. [sent-242, score-0.105]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('replica', 0.746), ('mmse', 0.306), ('xmap', 0.181), ('scalar', 0.159), ('postulated', 0.153), ('ppost', 0.143), ('verd', 0.143), ('map', 0.141), ('claim', 0.127), ('estimators', 0.103), ('lasso', 0.1), ('estimator', 0.098), ('mse', 0.093), ('xj', 0.09), ('noise', 0.09), ('guo', 0.089), ('compressed', 0.086), ('asymptotic', 0.081), ('tthard', 0.079), ('ps', 0.074), ('pu', 0.068), ('px', 0.067), ('lim', 0.063), ('post', 0.052), ('spin', 0.051), ('sj', 0.05), ('pursuit', 0.05), ('estimation', 0.05), ('sensing', 0.049), ('cdma', 0.048), ('normregularized', 0.048), ('xmmse', 0.048), ('xmpmse', 0.048), ('measurements', 0.046), ('levels', 0.046), ('recovery', 0.045), ('equations', 0.043), ('soft', 0.04), ('measurement', 0.04), ('effective', 0.038), ('thresholding', 0.033), ('estimate', 0.033), ('median', 0.032), ('mpmse', 0.032), ('limits', 0.031), ('physics', 0.031), ('xu', 0.029), ('behavior', 0.029), ('sparse', 0.029), ('components', 0.029), ('component', 0.029), ('alyson', 0.028), ('rangan', 0.028), ('uf', 0.028), ('xpx', 0.028), ('dx', 0.027), ('expectations', 0.026), ('fletcher', 0.025), ('tanaka', 0.025), ('corrupted', 0.024), ('donoho', 0.024), ('edwards', 0.024), ('glasses', 0.023), ('asserts', 0.023), ('satisfy', 0.022), ('zero', 0.022), ('pr', 0.022), ('level', 0.021), ('february', 0.021), ('laws', 0.021), ('snr', 0.021), ('operator', 0.02), ('gaussian', 0.02), ('prior', 0.019), ('arg', 0.019), ('rn', 0.019), ('minimizer', 0.019), ('index', 0.019), ('basis', 0.018), ('variations', 0.018), ('numerical', 0.017), ('cand', 0.017), ('vector', 0.017), ('cost', 0.017), ('january', 0.016), ('se', 0.016), ('sparsity', 0.016), ('november', 0.016), ('assumptions', 0.016), ('ieee', 0.016), ('posteriori', 0.016), ('expressions', 0.016), ('theory', 0.015), ('certain', 0.015), ('priors', 0.015), ('deterministic', 0.015), ('rm', 0.014), ('arxiv', 0.014), ('april', 0.014), ('nonlinear', 0.014)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0 36 nips-2009-Asymptotic Analysis of MAP Estimation via the Replica Method and Compressed Sensing

Author: Sundeep Rangan, Vivek Goyal, Alyson K. Fletcher

Abstract: The replica method is a non-rigorous but widely-accepted technique from statistical physics used in the asymptotic analysis of large, random, nonlinear problems. This paper applies the replica method to non-Gaussian maximum a posteriori (MAP) estimation. It is shown that with random linear measurements and Gaussian noise, the asymptotic behavior of the MAP estimate of an n-dimensional vector “decouples” as n scalar MAP estimators. The result is a counterpart to Guo and Verdú’s replica analysis of minimum mean-squared error estimation. The replica MAP analysis can be readily applied to many estimators used in compressed sensing, including basis pursuit, lasso, linear estimation with thresholding, and zero norm-regularized estimation. In the case of lasso estimation the scalar estimator reduces to a soft-thresholding operator, and for zero norm-regularized estimation it reduces to a hard threshold. Among other benefits, the replica method provides a computationally-tractable method for exactly computing various performance metrics including mean-squared error and sparsity pattern recovery probability.

2 0.091283783 185 nips-2009-Orthogonal Matching Pursuit From Noisy Random Measurements: A New Analysis

Author: Sundeep Rangan, Alyson K. Fletcher

Abstract: A well-known analysis of Tropp and Gilbert shows that orthogonal matching pursuit (OMP) can recover a k-sparse n-dimensional real vector from m = 4k log(n) noise-free linear measurements obtained through a random Gaussian measurement matrix with a probability that approaches one as n → ∞. This work strengthens this result by showing that a lower number of measurements, m = 2k log(n − k), is in fact sufficient for asymptotic recovery. More generally, when the sparsity level satisfies kmin ≤ k ≤ kmax but is unknown, m = 2kmax log(n − kmin ) measurements is sufficient. Furthermore, this number of measurements is also sufficient for detection of the sparsity pattern (support) of the vector with measurement errors provided the signal-to-noise ratio (SNR) scales to infinity. The scaling m = 2k log(n − k) exactly matches the number of measurements required by the more complex lasso method for signal recovery in a similar SNR scaling.

3 0.079293855 245 nips-2009-Thresholding Procedures for High Dimensional Variable Selection and Statistical Estimation

Author: Shuheng Zhou

Abstract: Given n noisy samples with p dimensions, where n ≪ p, we show that the multistep thresholding procedure can accurately estimate a sparse vector β ∈ Rp in a linear model, under the restricted eigenvalue conditions (Bickel-Ritov-Tsybakov 09). Thus our conditions for model selection consistency are considerably weaker than what has been achieved in previous works. More importantly, this method allows very significant values of s, which is the number of non-zero elements in the true parameter. For example, it works for cases where the ordinary Lasso would have failed. Finally, we show that if X obeys a uniform uncertainty principle and if the true parameter is sufficiently sparse, the Gauss-Dantzig selector (Candès-Tao 07) achieves the ℓ2 loss within a logarithmic factor of the ideal mean square error one would achieve with an oracle which would supply perfect information about which coordinates are non-zero and which are above the noise level, while selecting a sufficiently sparse model.

4 0.075395942 35 nips-2009-Approximating MAP by Compensating for Structural Relaxations

Author: Arthur Choi, Adnan Darwiche

Abstract: We introduce a new perspective on approximations to the maximum a posteriori (MAP) task in probabilistic graphical models, that is based on simplifying a given instance, and then tightening the approximation. First, we start with a structural relaxation of the original model. We then infer from the relaxation its deficiencies, and compensate for them. This perspective allows us to identify two distinct classes of approximations. First, we find that max-product belief propagation can be viewed as a way to compensate for a relaxation, based on a particular idealized case for exactness. We identify a second approach to compensation that is based on a more refined idealized case, resulting in a new approximation with distinct properties. We go on to propose a new class of algorithms that, starting with a relaxation, iteratively seeks tighter approximations. 1

5 0.07482259 10 nips-2009-A Gaussian Tree Approximation for Integer Least-Squares

Author: Jacob Goldberger, Amir Leshem

Abstract: This paper proposes a new algorithm for the linear least squares problem where the unknown variables are constrained to be in a finite set. The factor graph that corresponds to this problem is very loopy; in fact, it is a complete graph. Hence, applying the Belief Propagation (BP) algorithm yields very poor results. The algorithm described here is based on an optimal tree approximation of the Gaussian density of the unconstrained linear system. It is shown that even though the approximation is not directly applied to the exact discrete distribution, applying the BP algorithm to the modified factor graph outperforms current methods in terms of both performance and complexity. The improved performance of the proposed algorithm is demonstrated on the problem of MIMO detection.

6 0.068357922 157 nips-2009-Multi-Label Prediction via Compressed Sensing

7 0.067538433 1 nips-2009-$L 1$-Penalized Robust Estimation for a Class of Inverse Problems Arising in Multiview Geometry

8 0.065495238 55 nips-2009-Compressed Least-Squares Regression

9 0.064461954 43 nips-2009-Bayesian estimation of orientation preference maps

10 0.063565955 20 nips-2009-A unified framework for high-dimensional analysis of $M$-estimators with decomposable regularizers

11 0.062332876 207 nips-2009-Robust Nonparametric Regression with Metric-Space Valued Output

12 0.057331409 64 nips-2009-Data-driven calibration of linear estimators with minimal penalties

13 0.056640293 173 nips-2009-Nonparametric Greedy Algorithms for the Sparse Learning Problem

14 0.051071659 59 nips-2009-Construction of Nonparametric Bayesian Models from Parametric Bayes Equations

15 0.050081156 225 nips-2009-Sparsistent Learning of Varying-coefficient Models with Structural Changes

16 0.047024999 228 nips-2009-Speeding up Magnetic Resonance Image Acquisition by Bayesian Multi-Slice Adaptive Compressed Sensing

17 0.045918513 15 nips-2009-A Rate Distortion Approach for Semi-Supervised Conditional Random Fields

18 0.04471989 87 nips-2009-Exponential Family Graph Matching and Ranking

19 0.044457078 222 nips-2009-Sparse Estimation Using General Likelihoods and Non-Factorial Priors

20 0.044170015 208 nips-2009-Robust Principal Component Analysis: Exact Recovery of Corrupted Low-Rank Matrices via Convex Optimization


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.116), (1, 0.052), (2, 0.023), (3, 0.054), (4, -0.008), (5, -0.055), (6, 0.071), (7, -0.101), (8, 0.086), (9, -0.014), (10, -0.005), (11, 0.019), (12, -0.002), (13, 0.024), (14, 0.036), (15, 0.051), (16, -0.047), (17, -0.001), (18, -0.015), (19, 0.014), (20, -0.03), (21, 0.017), (22, 0.036), (23, 0.068), (24, 0.022), (25, 0.059), (26, 0.027), (27, -0.042), (28, -0.051), (29, 0.01), (30, -0.149), (31, 0.141), (32, -0.016), (33, 0.082), (34, 0.033), (35, 0.035), (36, 0.086), (37, 0.027), (38, 0.032), (39, 0.005), (40, 0.051), (41, 0.054), (42, 0.023), (43, -0.091), (44, -0.045), (45, -0.005), (46, 0.05), (47, 0.019), (48, -0.051), (49, -0.001)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.93398494 36 nips-2009-Asymptotic Analysis of MAP Estimation via the Replica Method and Compressed Sensing

Author: Sundeep Rangan, Vivek Goyal, Alyson K. Fletcher

Abstract: The replica method is a non-rigorous but widely-accepted technique from statistical physics used in the asymptotic analysis of large, random, nonlinear problems. This paper applies the replica method to non-Gaussian maximum a posteriori (MAP) estimation. It is shown that with random linear measurements and Gaussian noise, the asymptotic behavior of the MAP estimate of an n-dimensional vector “decouples” as n scalar MAP estimators. The result is a counterpart to Guo and Verdú’s replica analysis of minimum mean-squared error estimation. The replica MAP analysis can be readily applied to many estimators used in compressed sensing, including basis pursuit, lasso, linear estimation with thresholding, and zero norm-regularized estimation. In the case of lasso estimation the scalar estimator reduces to a soft-thresholding operator, and for zero norm-regularized estimation it reduces to a hard threshold. Among other benefits, the replica method provides a computationally-tractable method for exactly computing various performance metrics including mean-squared error and sparsity pattern recovery probability.

2 0.72289693 185 nips-2009-Orthogonal Matching Pursuit From Noisy Random Measurements: A New Analysis

Author: Sundeep Rangan, Alyson K. Fletcher

Abstract: A well-known analysis of Tropp and Gilbert shows that orthogonal matching pursuit (OMP) can recover a k-sparse n-dimensional real vector from m = 4k log(n) noise-free linear measurements obtained through a random Gaussian measurement matrix with a probability that approaches one as n → ∞. This work strengthens this result by showing that a lower number of measurements, m = 2k log(n − k), is in fact sufficient for asymptotic recovery. More generally, when the sparsity level satisfies kmin ≤ k ≤ kmax but is unknown, m = 2kmax log(n − kmin ) measurements is sufficient. Furthermore, this number of measurements is also sufficient for detection of the sparsity pattern (support) of the vector with measurement errors provided the signal-to-noise ratio (SNR) scales to infinity. The scaling m = 2k log(n − k) exactly matches the number of measurements required by the more complex lasso method for signal recovery in a similar SNR scaling.

3 0.59153843 105 nips-2009-Grouped Orthogonal Matching Pursuit for Variable Selection and Prediction

Author: Grzegorz Swirszcz, Naoki Abe, Aurelie C. Lozano

Abstract: We consider the problem of variable group selection for least squares regression, namely, that of selecting groups of variables for best regression performance, leveraging and adhering to a natural grouping structure within the explanatory variables. We show that this problem can be efficiently addressed by using a certain greedy style algorithm. More precisely, we propose the Group Orthogonal Matching Pursuit algorithm (Group-OMP), which extends the standard OMP procedure (also referred to as “forward greedy feature selection algorithm” for least squares regression) to perform stage-wise group variable selection. We prove that under certain conditions Group-OMP can identify the correct (groups of) variables. We also provide an upperbound on the l∞ norm of the difference between the estimated regression coefficients and the true coefficients. Experimental results on simulated and real world datasets indicate that Group-OMP compares favorably to Group Lasso, OMP and Lasso, both in terms of variable selection and prediction accuracy. 1

4 0.59020776 173 nips-2009-Nonparametric Greedy Algorithms for the Sparse Learning Problem

Author: Han Liu, Xi Chen

Abstract: This paper studies the forward greedy strategy in sparse nonparametric regression. For additive models, we propose an algorithm called additive forward regression; for general multivariate models, we propose an algorithm called generalized forward regression. Both algorithms simultaneously conduct estimation and variable selection in nonparametric settings for the high dimensional sparse learning problem. Our main emphasis is empirical: on both simulated and real data, these two simple greedy methods can clearly outperform several state-ofthe-art competitors, including LASSO, a nonparametric version of LASSO called the sparse additive model (SpAM) and a recently proposed adaptive parametric forward-backward algorithm called Foba. We also provide some theoretical justifications of specific versions of the additive forward regression. 1

5 0.58895665 245 nips-2009-Thresholding Procedures for High Dimensional Variable Selection and Statistical Estimation

Author: Shuheng Zhou

Abstract: Given n noisy samples with p dimensions, where n ≪ p, we show that the multistep thresholding procedure can accurately estimate a sparse vector β ∈ Rp in a linear model, under the restricted eigenvalue conditions (Bickel-Ritov-Tsybakov 09). Thus our conditions for model selection consistency are considerably weaker than what has been achieved in previous works. More importantly, this method allows very significant values of s, which is the number of non-zero elements in the true parameter. For example, it works for cases where the ordinary Lasso would have failed. Finally, we show that if X obeys a uniform uncertainty principle and if the true parameter is sufficiently sparse, the Gauss-Dantzig selector (Candès-Tao 07) achieves the ℓ2 loss within a logarithmic factor of the ideal mean square error one would achieve with an oracle which would supply perfect information about which coordinates are non-zero and which are above the noise level, while selecting a sufficiently sparse model.

6 0.52375144 10 nips-2009-A Gaussian Tree Approximation for Integer Least-Squares

7 0.51568425 20 nips-2009-A unified framework for high-dimensional analysis of $M$-estimators with decomposable regularizers

8 0.49772933 35 nips-2009-Approximating MAP by Compensating for Structural Relaxations

9 0.49336553 222 nips-2009-Sparse Estimation Using General Likelihoods and Non-Factorial Priors

10 0.49235579 1 nips-2009-$L 1$-Penalized Robust Estimation for a Class of Inverse Problems Arising in Multiview Geometry

11 0.48268241 157 nips-2009-Multi-Label Prediction via Compressed Sensing

12 0.4639639 228 nips-2009-Speeding up Magnetic Resonance Image Acquisition by Bayesian Multi-Slice Adaptive Compressed Sensing

13 0.46396026 7 nips-2009-A Data-Driven Approach to Modeling Choice

14 0.42832002 55 nips-2009-Compressed Least-Squares Regression

15 0.42292932 138 nips-2009-Learning with Compressible Priors

16 0.40762734 79 nips-2009-Efficient Recovery of Jointly Sparse Vectors

17 0.3982079 108 nips-2009-Heterogeneous multitask learning with joint sparsity constraints

18 0.38637117 207 nips-2009-Robust Nonparametric Regression with Metric-Space Valued Output

19 0.38035759 225 nips-2009-Sparsistent Learning of Varying-coefficient Models with Structural Changes

20 0.36967632 43 nips-2009-Bayesian estimation of orientation preference maps


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(24, 0.041), (25, 0.082), (35, 0.031), (36, 0.101), (39, 0.027), (54, 0.281), (58, 0.118), (61, 0.04), (71, 0.054), (77, 0.032), (81, 0.02), (86, 0.049), (91, 0.015)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.75738078 36 nips-2009-Asymptotic Analysis of MAP Estimation via the Replica Method and Compressed Sensing

Author: Sundeep Rangan, Vivek Goyal, Alyson K. Fletcher

Abstract: The replica method is a non-rigorous but widely-accepted technique from statistical physics used in the asymptotic analysis of large, random, nonlinear problems. This paper applies the replica method to non-Gaussian maximum a posteriori (MAP) estimation. It is shown that with random linear measurements and Gaussian noise, the asymptotic behavior of the MAP estimate of an n-dimensional vector “decouples” as n scalar MAP estimators. The result is a counterpart to Guo and Verdú’s replica analysis of minimum mean-squared error estimation. The replica MAP analysis can be readily applied to many estimators used in compressed sensing, including basis pursuit, lasso, linear estimation with thresholding, and zero norm-regularized estimation. In the case of lasso estimation the scalar estimator reduces to a soft-thresholding operator, and for zero norm-regularized estimation it reduces to a hard threshold. Among other benefits, the replica method provides a computationally-tractable method for exactly computing various performance metrics including mean-squared error and sparsity pattern recovery probability.

2 0.75406706 189 nips-2009-Periodic Step Size Adaptation for Single Pass On-line Learning

Author: Chun-nan Hsu, Yu-ming Chang, Hanshen Huang, Yuh-jye Lee

Abstract: It has been established that the second-order stochastic gradient descent (2SGD) method can potentially achieve generalization performance as well as empirical optimum in a single pass (i.e., epoch) through the training examples. However, 2SGD requires computing the inverse of the Hessian matrix of the loss function, which is prohibitively expensive. This paper presents Periodic Step-size Adaptation (PSA), which approximates the Jacobian matrix of the mapping function and explores a linear relation between the Jacobian and Hessian to approximate the Hessian periodically and achieve near-optimal results in experiments on a wide variety of models and tasks. 1

3 0.74168813 168 nips-2009-Non-stationary continuous dynamic Bayesian networks

Author: Marco Grzegorczyk, Dirk Husmeier

Abstract: Dynamic Bayesian networks have been applied widely to reconstruct the structure of regulatory processes from time series data. The standard approach is based on the assumption of a homogeneous Markov chain, which is not valid in many realworld scenarios. Recent research efforts addressing this shortcoming have considered undirected graphs, directed graphs for discretized data, or over-flexible models that lack any information sharing among time series segments. In the present article, we propose a non-stationary dynamic Bayesian network for continuous data, in which parameters are allowed to vary among segments, and in which a common network structure provides essential information sharing across segments. Our model is based on a Bayesian multiple change-point process, where the number and location of the change-points is sampled from the posterior distribution. 1

4 0.67357087 45 nips-2009-Beyond Convexity: Online Submodular Minimization

Author: Elad Hazan, Satyen Kale

Abstract: We consider an online decision problem over a discrete space in which the loss function is submodular. We give algorithms which are computationally efficient and are Hannan-consistent in both the full information and bandit settings. 1

5 0.56234622 254 nips-2009-Variational Gaussian-process factor analysis for modeling spatio-temporal data

Author: Jaakko Luttinen, Alexander T. Ihler

Abstract: We present a probabilistic factor analysis model which can be used for studying spatio-temporal datasets. The spatial and temporal structure is modeled by using Gaussian process priors both for the loading matrix and the factors. The posterior distributions are approximated using the variational Bayesian framework. High computational cost of Gaussian process modeling is reduced by using sparse approximations. The model is used to compute the reconstructions of the global sea surface temperatures from a historical dataset. The results suggest that the proposed model can outperform the state-of-the-art reconstruction systems.

6 0.55970091 22 nips-2009-Accelerated Gradient Methods for Stochastic Optimization and Online Learning

7 0.55861849 1 nips-2009-$L 1$-Penalized Robust Estimation for a Class of Inverse Problems Arising in Multiview Geometry

8 0.55790091 185 nips-2009-Orthogonal Matching Pursuit From Noisy Random Measurements: A New Analysis

9 0.55633479 20 nips-2009-A unified framework for high-dimensional analysis of $M$-estimators with decomposable regularizers

10 0.55488062 61 nips-2009-Convex Relaxation of Mixture Regression with Efficient Algorithms

11 0.55428743 37 nips-2009-Asymptotically Optimal Regularization in Smooth Parametric Models

12 0.55349082 215 nips-2009-Sensitivity analysis in HMMs with application to likelihood maximization

13 0.55256492 224 nips-2009-Sparse and Locally Constant Gaussian Graphical Models

14 0.55163062 97 nips-2009-Free energy score space

15 0.55002338 207 nips-2009-Robust Nonparametric Regression with Metric-Space Valued Output

16 0.54851735 225 nips-2009-Sparsistent Learning of Varying-coefficient Models with Structural Changes

17 0.5484786 64 nips-2009-Data-driven calibration of linear estimators with minimal penalties

18 0.54826736 158 nips-2009-Multi-Label Prediction via Sparse Infinite CCA

19 0.54798973 174 nips-2009-Nonparametric Latent Feature Models for Link Prediction

20 0.54791534 169 nips-2009-Nonlinear Learning using Local Coordinate Coding