nips nips2013 nips2013-212 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Haichao Zhang, David Wipf
Abstract: Typical blur from camera shake often deviates from the standard uniform convolutional assumption, in part because of problematic rotations which create greater blurring away from some unknown center point. Consequently, successful blind deconvolution for removing shake artifacts requires the estimation of a spatially-varying or non-uniform blur operator. Using ideas from Bayesian inference and convex analysis, this paper derives a simple non-uniform blind deblurring algorithm with a spatially-adaptive image penalty. Through an implicit normalization process, this penalty automatically adjusts its shape based on the estimated degree of local blur and image structure such that regions with large blur or few prominent edges are discounted. Remaining regions with modest blur and revealing edges therefore dominate on average without explicitly incorporating structure-selection heuristics. The algorithm can be implemented using an optimization strategy that is virtually tuning-parameter free and simpler than existing methods, and likely can be applied in other settings such as dictionary learning. Detailed theoretical analysis and empirical comparisons on real images serve as validation.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract Typical blur from camera shake often deviates from the standard uniform convolutional assumption, in part because of problematic rotations which create greater blurring away from some unknown center point. [sent-3, score-1.056]
2 Consequently, successful blind deconvolution for removing shake artifacts requires the estimation of a spatially-varying or non-uniform blur operator. [sent-4, score-1.113]
3 Using ideas from Bayesian inference and convex analysis, this paper derives a simple non-uniform blind deblurring algorithm with a spatially-adaptive image penalty. [sent-5, score-0.88]
4 Through an implicit normalization process, this penalty automatically adjusts its shape based on the estimated degree of local blur and image structure such that regions with large blur or few prominent edges are discounted. [sent-6, score-1.572]
5 Remaining regions with modest blur and revealing edges therefore dominate on average without explicitly incorporating structure-selection heuristics. [sent-7, score-0.64]
6 1 Introduction. Image blur is an undesirable degradation that often accompanies the image formation process and may arise, for example, because of camera shake during acquisition. [sent-10, score-1.122]
7 Blind image deblurring strategies aim to recover a sharp image from only a blurry, compromised observation. [sent-11, score-0.916]
8 Unfortunately, many real-world photographs contain blur effects that vary across the image plane, such as when unknown rotations are introduced by camera shake [17]. [sent-13, score-1.119]
9 Note that the original uniform blur model can be achieved equivalently when H is forced to adopt certain structure (e.g., when each $P_j$ is a pure translation, in which case $Hx$ reduces to a standard convolution). [sent-15, score-0.557]
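To make the translation-only special case concrete, here is a minimal numpy sketch (our own illustration, not code from the paper; the 1-D signal, weights, and shift grid are invented for the demo) checking that when every $P_j$ is a pure shift, $H = \sum_j w_j P_j$ acts as an ordinary convolution:

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.standard_normal(64)            # toy 1-D "sharp" signal
w = np.array([0.2, 0.5, 0.3])          # kernel weights over a shift grid
shifts = (-1, 0, 1)                    # each P_j is a pure translation

# H x = sum_j w_j P_j x, with circular shifts standing in for the P_j
Hx = sum(wj * np.roll(x, s) for wj, s in zip(w, shifts))
conv = np.convolve(x, w, mode="same")  # standard uniform convolution

# Agreement away from the boundary (roll wraps while convolve zero-pads)
assert np.allclose(Hx[1:-1], conv[1:-1])
```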
10 In general, non-uniform blur may arise under several different contexts. [sent-18, score-0.528]
11 This paper will focus on the blind removal of non-uniform blur caused by general camera shake (as opposed to blur from object motion) using only a single image, with no additional hardware assistance. [sent-19, score-1.737]
12 While existing algorithms for addressing non-uniform camera shake have displayed a measure of success, several important limitations remain. [sent-20, score-0.367]
13 First, some methods require either additional specialized hardware such as high-speed video capture [23] or inertial measurement sensors [13] for estimating motion, or else multiple images of the same scene [4]. [sent-21, score-0.121]
14 Secondly, even the algorithms that operate given only data from a single image typically rely on carefully engineered initializations, heuristics, and trade-off parameters for selecting salient image structure or edges, in part to avoid undesirable degenerate, no-blur solutions [7, 8, 9, 11]. [sent-22, score-0.469]
15 This transparency leads to a nearly tuning-parameter free algorithm based upon a sparsity penalty whose shape adapts to the estimated degree of local blur, and provides theoretical arguments regarding how to robustly handle non-uniform degradations. [sent-25, score-0.236]
16 Section 2 briefly describes relevant existing work on non-uniform blind deblurring operators and implementation techniques. [sent-27, score-0.729]
17 Section 3 then introduces the proposed non-uniform blind deblurring model, while further theoretical justification and analyses are provided in Section 4. [sent-28, score-0.692]
18 The downside with this type of model is that geometric relationships between the blur kernels of different regions derived from the physical motion path of the camera are ignored. [sent-32, score-0.86]
19 In this regard, (1) represents a more general model that has been used in many recent non-uniform deblurring efforts [23, 25, 7, 11, 4]. [sent-35, score-0.452]
20 PMP also retains the bilinear property of uniform convolution, meaning that $y = Hx + n = Dw + n$, where $H = \sum_j w_j P_j$ (2) and $D = [P_1 x, P_2 x, \cdots, P_j x, \cdots]$ is a matrix of transformed sharp images. [sent-36, score-0.193]
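The bilinearity in (2) is straightforward to exercise numerically. Below is a hedged sketch (our own code, not the authors'; small in-plane rotations stand in for the full projective pose grid, and the image, angles, and weights are all invented) that builds $D = [P_1 x, P_2 x, \cdots]$ and forms the blurry observation as $y = Dw$:

```python
import numpy as np
from scipy.ndimage import rotate

def transform_stack(x, angles):
    # Columns are P_j x: the sharp image warped by each camera pose
    # (rotations are a simple stand-in for projective homographies)
    return np.stack([rotate(x, a, reshape=False, order=1).ravel()
                     for a in angles], axis=1)

rng = np.random.default_rng(0)
x = rng.random((32, 32))              # toy sharp image
angles = np.linspace(-2.0, 2.0, 9)    # pose grid (degrees)
w = rng.random(angles.size)
w /= w.sum()                          # kernel weights, normalized to sum to one

D = transform_stack(x, angles)        # D = [P_1 x, P_2 x, ...]
y = D @ w                             # blurry image: y = D w = H x
```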
21 However, EFF can be combined with the PMP model by introducing a set of basis images efficiently generated by transforming a grid of delta peak images [9]. [sent-38, score-0.142]
22 3 A New Non-Uniform Blind Deblurring Model. Following previous work [6, 16], we will work in the derivative domain of images for ease of modeling and better performance, meaning that $x \in \mathbb{R}^m$ and $y \in \mathbb{R}^n$ will denote the lexicographically ordered sharp and blurry image derivatives respectively. [sent-40, score-0.504]
23 While presently γ is unknown, if we first marginalize over the unknown x, we can estimate it jointly along with the blur parameters w and the unknown noise variance λ. [sent-52, score-0.582]
24 The final sharp image can then be recovered using the estimated kernel and noise level along with standard non-blind deblurring algorithms. [sent-54, score-0.839]
25 For example, consider the simplified non-uniform deblurring situation where the true x has a single non-zero element and H is defined such that each column indexed by i is independently parameterized with finite support symmetric around pixel i. [sent-61, score-0.486]
26 Moreover, assume this support matches the true support of the unknown blur operator. [sent-62, score-0.528]
27 Then we have the following: Lemma 1 Given the idealized non-uniform deblurring problem described above, the cost function (4) will be characterized by a unique minimizing solution that correctly locates the nonzero element in x and the corresponding true blur kernel at this location. [sent-63, score-1.092]
28 Specifically, using Hadamard's inequality we have
$$\log\left|H\Gamma H^T + \lambda I\right| = n\log\lambda + \log|\Gamma| + \log\left|\lambda^{-1}H^T H + \Gamma^{-1}\right| \leq n\log\lambda + \log|\Gamma| + \log\left|\mathrm{diag}\!\left[\lambda^{-1}H^T H + \Gamma^{-1}\right]\right| = \sum_i \log\left(\lambda + \gamma_i\|\bar{w}_i\|_2^2\right) + (n-m)\log\lambda, \quad (6)$$
where $\bar{w}_i$ denotes the $i$-th column of $H$. [sent-71, score-0.534]
29 We then have $\log\left|\lambda^{-1}H^T H + \Gamma^{-1}\right| = 2\log|V| \leq 2\log\left(\prod_i \|v_i\|_2\right) = \log\left|\mathrm{diag}\!\left[\lambda^{-1}H^T H + \Gamma^{-1}\right]\right|$, leading to the stated result. [sent-76, score-0.104]
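The chain of bounds in (6) can be sanity-checked numerically. The sketch below is our own verification (with random $H$, $\Gamma$, and $\lambda$, invented for the test) that the diagonal, Hadamard-based expression upper-bounds the exact log-determinant:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, lam = 12, 8, 0.1
H = rng.standard_normal((n, m))
gamma = rng.random(m) + 0.1                  # diagonal entries of Gamma

_, lhs = np.linalg.slogdet(H @ np.diag(gamma) @ H.T + lam * np.eye(n))
col_norms_sq = np.sum(H**2, axis=0)          # ||w_bar_i||_2^2 for each column
rhs = np.sum(np.log(lam + gamma * col_norms_sq)) + (n - m) * np.log(lam)
assert lhs <= rhs + 1e-9                     # the bound from (6) holds
```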
30 Also, the quantity $\|\bar{w}_i\|_2^2$, which appears in (6), can be viewed as a measure of the degree of local blur at location $i$. [sent-77, score-0.733]
31 Given the feasible region $w \geq 0$ and, without loss of generality, the constraint $\sum_j w_j = 1$ for normalization purposes, it can easily be shown that $1/L \leq \|\bar{w}_i\|_2^2 \leq 1$, where $L$ is the maximum number of elements in any local blur kernel $\bar{w}_i$, or column of $H$. [sent-78, score-1.099]
32 The upper bound on $\|\bar{w}_i\|_2^2$ is achieved when the local kernel is a delta solution, meaning only one nonzero element, and therefore minimal blur. [sent-79, score-0.165]
33 In contrast, the lower bound on $\|\bar{w}_i\|_2^2$ occurs when every element of $\bar{w}_i$ has an equal value, constituting the maximal possible blur. [sent-80, score-0.266]
34 This metric, which will influence our analysis in the next section, can be computed using $\|\bar{w}_i\|_2^2 = w^T(B_i^T B_i)w$, where $B_i \triangleq [P_1 e_i, P_2 e_i, \cdots, P_j e_i, \cdots]$ and $e_i$ denotes an all-zero image with a one at site $i$. [sent-81, score-0.441]
35 In the uniform deblurring case, $B_i^T B_i = I$ ignoring edge effects, and therefore $\|\bar{w}_i\|_2 = \|w\|_2$ for all $i$. [sent-82, score-0.614]
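A direct, if inefficient, way to evaluate this local blur measure is to form $B_i$ explicitly from transformed delta images. The sketch below does exactly that on a toy grid (our own illustration: rotations again stand in for the homographies $P_j$, and the pixel location and kernels are invented). A delta kernel yields a norm near one, while a spread-out kernel at a pixel far from the rotation center yields a smaller value, i.e., heavier local blur:

```python
import numpy as np
from scipy.ndimage import rotate

def local_blur_norm_sq(w, angles, shape, i):
    # ||w_bar_i||_2^2 = w^T (B_i^T B_i) w, with B_i = [P_1 e_i, P_2 e_i, ...]
    # and e_i an all-zero image with a one at site i
    e = np.zeros(shape)
    e[np.unravel_index(i, shape)] = 1.0
    B = np.stack([rotate(e, a, reshape=False, order=1).ravel()
                  for a in angles], axis=1)
    return float(w @ (B.T @ B) @ w)

shape, angles = (32, 32), np.linspace(-4.0, 4.0, 9)
i = np.ravel_multi_index((2, 29), shape)        # pixel far from rotation center

w_flat = np.ones(9) / 9                         # spread-out kernel: heavy blur
w_delta = np.zeros(9); w_delta[4] = 1.0         # delta kernel (zero rotation)
print(local_blur_norm_sq(w_flat, angles, shape, i))   # noticeably below 1
print(local_blur_norm_sq(w_delta, angles, shape, i))  # approximately 1
```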
36 However, it is still well-equipped for estimating sparse image gradients and avoiding degenerate no-blur solutions. [sent-84, score-0.293]
37 For example, consider the case of an asymptotically large image with iid distributed sparse image gradients, with some constant fraction exactly equal to zero and the remaining nonzero elements drawn from any continuous distribution. [sent-85, score-0.42]
38 Now suppose that this image is corrupted with a non-uniform blur operator of the form $H = \sum_j w_j P_j$, where the cardinality of the summation is finite and $H$ satisfies minimal regularity conditions. [sent-86, score-0.793]
39 Then it can be shown that any global minimum of (4), with or without the bound from (6), will produce the true blur operator. [sent-87, score-0.528]
40 Related intuition applies when noise is present or when the image gradients are not exactly sparse (we will defer more detailed analysis to a future publication). [sent-88, score-0.287]
41 Regardless, the simplified γ-dependent cost function is still far less intuitive than the penalized regression models dependent on x such as (5) that are typically employed for non-uniform blind deblurring. [sent-89, score-0.297]
42 However, using the framework from [26], it can be shown that the kernel estimate obtained by this process is formally equivalent to the one obtained via
$$\min_{x;\, w \geq 0,\, \lambda \geq 0} \; \frac{1}{\lambda}\left\|y - Hx\right\|_2^2 + \sum_i \psi\!\left(|x_i|\,\|\bar{w}_i\|_2,\, \lambda\right) + (n-m)\log\lambda, \quad (7)$$
with $\psi(u, \lambda) \triangleq \dfrac{2u}{u + \sqrt{u^2 + 4\lambda}} + \log\left(2\lambda + u^2 + u\sqrt{u^2 + 4\lambda}\right)$, $u \geq 0$. [sent-90, score-0.218]
43 The optimization from (7) closely resembles a standard penalized regression (or equivalently MAP) problem used for blind deblurring. [sent-91, score-0.296]
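To see the coupled penalty concretely, here is a small sketch (our own code) implementing $\psi$ from (7) and probing its shape: for large $\lambda$ it grows almost linearly in $u$ (an $\ell_1$-like profile), while for small $\lambda$ it flattens into a log-like, nearly $\ell_0$ profile — the behavior exploited by the continuation scheme discussed in Section 4.2:

```python
import numpy as np

def psi(u, lam):
    # psi(u, lam) = 2u / (u + sqrt(u^2 + 4 lam))
    #             + log(2 lam + u^2 + u sqrt(u^2 + 4 lam)),  u >= 0
    r = np.sqrt(u**2 + 4.0 * lam)
    return 2.0 * u / (u + r) + np.log(2.0 * lam + u**2 + u * r)

u = np.linspace(0.0, 3.0, 7)
for lam in (1.0, 0.1, 0.01):   # shrinking noise level => more concave shape
    print(lam, np.round(psi(u, lam) - psi(0.0, lam), 3))
```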
44 The primary distinction is the penalty term $\psi$, which jointly regularizes $x$, $w$, and $\lambda$, as discussed in Section 4. [sent-92, score-0.106]
45 The underlying procedure is related to variational Bayesian (VB) models from [1, 16, 20]; however, these models are based on a completely different mean-field approximation and a uniform blur assumption, and they do not learn the noise parameter. [sent-94, score-0.583]
46 4 Model Properties. The proposed blind deblurring strategy involves simply minimizing (7); no additional steps for trade-off parameter selection or structure/salient-edge detection are required, unlike other state-of-the-art approaches. [sent-96, score-0.692]
47 First, we will demonstrate a form of intrinsic column normalization that facilitates the balanced sparse estimation of the unknown latent image and implicitly de-emphasizes regions with large blur and few dominant edges. [sent-98, score-0.952]
48 4.1 Column-Normalized Sparse Estimation. Using the simple reparameterization $z_i \triangleq x_i\|\bar{w}_i\|_2$, it follows that (7) is exactly equivalent to solving
$$\min_{z;\, w \geq 0,\, \lambda \geq 0} \; \frac{1}{\lambda}\left\|y - \bar{H}z\right\|_2^2 + \sum_i \psi(|z_i|, \lambda) + (n-m)\log\lambda, \quad (8)$$
where $z = [z_1, z_2, \ldots]^T$ and $\bar{H}$ denotes $H$ with $\ell_2$-normalized columns. [sent-102, score-0.213]
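The equivalence of (7) and (8) rests on a simple identity: rescaling each column of $H$ to unit norm while rescaling the coefficients correspondingly leaves the data fit unchanged. A quick numerical check with random matrices (our own sketch):

```python
import numpy as np

rng = np.random.default_rng(2)
H = rng.standard_normal((12, 8))
x = rng.standard_normal(8)

norms = np.linalg.norm(H, axis=0)     # ||w_bar_i||_2 for each column of H
H_bar = H / norms                     # column-normalized operator
z = x * norms                         # reparameterization z_i = x_i ||w_bar_i||_2
assert np.allclose(H @ x, H_bar @ z)  # the data-fit term in (8) is unchanged
```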
49 Moreover, it can be shown that this ψ is a concave, non-decreasing function of |z|, and hence represents a canonical sparsity-promoting penalty function with respect to z [26]. [sent-106, score-0.106]
50 Moreover, no additional heuristic kernel penalty terms such as in (5) are required since H is in some sense self-regularized by the normalization. [sent-110, score-0.165]
51 While this will indeed result in normalized columns and a properly balanced data-fit term, these raw norms will now appear in the penalty function $g$, giving the equivalent objective
$$\min_{z;\, w \geq 0} \; \left\|y - \bar{H}z\right\|_2^2 + \alpha\sum_i g\!\left(z_i\|\bar{w}_i\|_2^{-1}\right) + \beta\sum_j h(w_j). \quad (9)$$ [sent-114, score-0.283]
52 Simply put, the problem (9) will favor solutions where the ratio $z_i/\|\bar{w}_i\|_2$ is sparse or nearly so, which can be achieved by either making many $z_i$ zero or many $\|\bar{w}_i\|_2$ big. [sent-116, score-0.31]
53 If some $z_i$ is estimated to be zero (and many $z_i$ will provably be exactly zero at any local minima if $g(x)$ is a concave, non-decreasing function of $|x|$), then the corresponding $\|\bar{w}_i\|_2$ will be unconstrained. [sent-117, score-0.244]
54 In contrast, if a given $z_i$ is non-zero, there will be a stronger push for the associated $\|\bar{w}_i\|_2$ to be large. [sent-118, score-0.133]
55 Thus, the relative penalization of the kernel norms will depend on the estimated local image gradients, and no-blur delta solutions may be arbitrarily favored in parts of the image plane dominated by edges, the very place where blur estimation information is paramount. [sent-121, score-1.204]
56 In reality, the local kernel norms $\|\bar{w}_i\|_2$, which quantify the degree of local blur as mentioned previously, should be completely independent of the sparsity of the image gradients in the same location. [sent-122, score-1.119]
57 This is of course because the different blurring effects from camera shake are independent of the locations of strong edges in a given scene, since the blur operator is only a function of camera motion (at least to first order approximation). [sent-123, score-1.293]
58 One way to compensate for this independence would be to simply optimize (9) with $\|\bar{w}_i\|_2$ removed from $g$. [sent-124, score-0.133]
59 In contrast, our algorithm handles these complications seamlessly without any additional penalty terms. [sent-127, score-0.106]
60 4.2 Noise-Dependent, Parameter-Free Homotopy Continuation. Column normalization can be viewed as a principled first step towards solving challenging sparse estimation problems. [sent-129, score-0.118]
61 However, when non-convex sparse regularizers are used for the image penalty, e.g., [sent-130, score-0.232]
62 $\ell_p$ norms with $p < 1$, local minima can become a significant problem. [sent-132, score-0.129]
63 When applied to a sharp image, any blur operator will necessarily contribute two opposing effects: (i) it reduces a measure of the image sparsity, which normally increases the penalty $\sum_i |y_i|^p$, and (ii) it broadly reduces the overall image variance, which actually reduces $\sum_i |y_i|^p$. [sent-134, score-1.127]
64 Note that we can always apply greater and greater blur to any sharp image x such that the variance of the resulting blurry y is arbitrarily small. [sent-136, score-1.027]
65 This then produces an arbitrarily small $\ell_p$ norm, which implies that $\sum_i |y_i|^p < \sum_i |x_i|^p$, meaning that the penalty actually favors the blurry image over the sharp one. [sent-137, score-0.561]
66 In a practical sense though, the amount of blur that can be tolerated before this undesirable preference for y over x occurs is much larger as p approaches zero. [sent-138, score-0.567]
67 This is because the more concave the image penalty becomes (as a function of coefficient magnitudes), the less sensitive it is to image variance and the more sensitive it is to image sparsity. [sent-139, score-0.752]
68 We may therefore expect such a highly concave, sparsity-promoting penalty to favor the sharp image over the blurry one in a broader range of blur conditions. [sent-141, score-1.12]
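This trade-off is easy to see on a toy 1-D example (our own illustration; the spike train and box kernels are invented). A sparse signal is blurred with normalized box kernels of growing width, and we print the ratio of the blurred to the sharp penalty $\sum_i|\cdot|^p$; ratios below one mean the penalty prefers the blurry signal. The $\ell_1$ ratio tends to dip below one as blur and sign cancellation grow, while more concave choices ($p = 0.5, 0.1$) keep favoring the sharp signal:

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.zeros(512)
idx = rng.choice(512, size=25, replace=False)
x[idx] = rng.standard_normal(25)            # sparse "sharp" gradient signal

def lp(v, p):
    return np.sum(np.abs(v)**p)

for width in (3, 9, 27):                    # increasing amount of blur
    k = np.ones(width) / width              # normalized box kernel
    y = np.convolve(x, k, mode="same")      # blurry signal
    ratios = [round(lp(y, p) / lp(x, p), 2) for p in (1.0, 0.5, 0.1)]
    print(width, ratios)                    # ratio < 1 favors the blurry signal
```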
69 Even with other families of penalty functions the same basic notion holds: greater concavity means greater sparsity preference and less sensitivity to variance changes that favor no-blur degenerate solutions. [sent-142, score-0.345]
70 From an implementational standpoint, homotopy continuation methods provide one attractive means of dealing with difficult non-convex penalty functions and the associated constellation of local minima [3]. [sent-143, score-0.375]
71 The basic idea is to use a parameterized family of sparsity-promoting functions $g(x; \theta)$, where different values of $\theta$ determine the relative degree of concavity, allowing a transition from something convex such as the $\ell_1$ norm (with $\theta$ large) to something concave such as the $\ell_0$ norm (with $\theta$ small). [sent-144, score-0.25]
72 While potentially effective in practice, homotopy continuation methods require both a trade-off parameter for g(x; θ) and a pre-defined schedule or heuristic for adjusting θ, both of which could potentially be image dependent. [sent-147, score-0.335]
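For contrast, a conventional continuation loop looks roughly like the sketch below. This is our own schematic, not any published implementation: the penalty family $g(x; \theta) = \log(1 + |x|/\theta)$, the trade-off parameter $\alpha$, and the $\theta$ schedule are all hand-picked assumptions — precisely the tuning burden that the noise-dependent scheme described next avoids:

```python
import numpy as np

def shrink(v, alpha, theta):
    # One majorize-minimize step for g(x; theta) = log(1 + |x| / theta):
    # a reweighted soft-threshold with weights alpha / (theta + |v|)
    return np.sign(v) * np.maximum(np.abs(v) - alpha / (theta + np.abs(v)), 0.0)

def continuation(A, y, alpha=0.1, thetas=(10.0, 1.0, 0.1, 0.01), iters=100):
    # Classic homotopy continuation: re-solve with increasing concavity
    # following a pre-defined theta schedule (the tuning we wish to avoid)
    x = np.zeros(A.shape[1])
    step = 1.0 / np.linalg.norm(A, 2)**2
    for theta in thetas:
        for _ in range(iters):
            x = shrink(x - step * A.T @ (A @ x - y), alpha * step, theta)
    return x

rng = np.random.default_rng(5)
A = rng.standard_normal((40, 100))
x_true = np.zeros(100)
x_true[rng.choice(100, size=5, replace=False)] = 1.0
x_hat = continuation(A, A @ x_true)
print(np.round(x_hat[x_true != 0], 2))   # estimates at the true support
```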
73 The proposed deblurring algorithm automatically implements a form of noise-dependent, parameterfree homotopy continuation with several attractive auxiliary properties [26]. [sent-148, score-0.599]
74 To make this claim precise and facilitate subsequent analysis, we first introduce the definition of relative concavity [19]: Definition 1 Let u be a strictly increasing function on [a, b]. [sent-149, score-0.13]
75 The function $\nu$ is concave relative to $u$ on the interval $[a, b]$ if and only if $\nu(y) \leq \nu(x) + \frac{\nu'(x)}{u'(x)}\left[u(y) - u(x)\right]$ holds $\forall x, y \in [a, b]$. [sent-150, score-0.109]
76 We will use $\nu \prec u$ to denote that $\nu$ is concave relative to $u$ on $[0, \infty)$. [sent-151, score-0.109]
77 This can be understood as a natural generalization of the traditional notion of concavity, in that a concave function is equivalently concave relative to a linear function per Definition 1. [sent-152, score-0.191]
78 In the context of homotopy continuation, an ideal candidate penalty would be one for which $g(x; \theta_1) \prec g(x; \theta_2)$ whenever $\theta_1 \leq \theta_2$. [sent-156, score-0.181]
79 This would ensure that greater sparsity-inducing concavity is introduced as θ is reduced. [sent-157, score-0.139]
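Definition 1 is easy to test numerically on a grid. The sketch below (our own verification, with finite-difference derivatives) checks that $\psi(\cdot, \lambda_1)$ is concave relative to $\psi(\cdot, \lambda_2)$ for $\lambda_1 < \lambda_2$, which is the ordering the algorithm relies on (cf. Theorem 1, referenced below):

```python
import numpy as np

def psi(u, lam):
    r = np.sqrt(u**2 + 4.0 * lam)
    return 2.0 * u / (u + r) + np.log(2.0 * lam + u**2 + u * r)

def relatively_concave(nu, u, grid, h=1e-6, tol=1e-8):
    # Definition 1 on a grid: nu(y) <= nu(x) + nu'(x)/u'(x) * [u(y) - u(x)]
    d_nu = (nu(grid + h) - nu(grid - h)) / (2.0 * h)
    d_u = (u(grid + h) - u(grid - h)) / (2.0 * h)
    X, Y = np.meshgrid(grid, grid, indexing="ij")
    rhs = nu(X) + (d_nu / d_u)[:, None] * (u(Y) - u(X))
    return bool(np.all(nu(Y) <= rhs + tol))

grid = np.linspace(1e-3, 5.0, 200)
print(relatively_concave(lambda t: psi(t, 0.01),   # smaller lambda ...
                         lambda t: psi(t, 1.0),    # ... relative to larger
                         grid))                    # expected: True
```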
80 Additionally, because $\lambda$ must be relatively large to arrive at this $\ell_1$ approximation, the estimation need only focus on reproducing the largest elements in $z$, since the sparse penalty will dominate the data-fit term. [sent-168, score-0.212]
81 Furthermore, these larger elements are on average more likely to be in regions of relatively lower blurring, or high $\|\bar{w}_i\|_2$ value, by virtue of the reparameterization $z_i = x_i\|\bar{w}_i\|_2$. [sent-169, score-0.431]
82 Consequently, the less concave initial estimation can proceed successfully by de-emphasizing regions with high blur or low $\|\bar{w}_i\|_2$, and focusing on coarsely approximating regions with relatively less blur. [sent-170, score-0.874]
83 Note that even if the true sharp image is not exactly sparse, as long as it can be reasonably well-approximated by some exactly sparse image in an $\ell_2$-norm sense, the analysis here still holds [27]. [sent-171, score-0.508]
84 Later, as the estimation proceeds and $w$ and $z$ are refined, $\lambda$ will be reduced, which in turn necessarily increases the relative concavity of the penalty $\psi$ per Theorem 1. [sent-174, score-0.265]
85 Eventually the penalty can even approach the $\ell_0$ norm (although images are generally not exactly sparse, and other noise factors and unmodeled artifacts are usually present such that $\lambda$ will never go all the way to zero). [sent-176, score-0.218]
86 Figure 1 displays results of this procedure both with and without the spatially-varying column normalizations and the implicit adaptive penalization that help compensate for locally varying image blur. [sent-178, score-0.259]
87 The supplementary file contains a number of additional comparisons, including assessments with a benchmark uniform blind deblurring dataset where ground truth is available. [sent-180, score-0.721]
88 Figure 2 displays deblurring comparisons based on the Butchershop and Vintage-car images. [sent-189, score-0.478]
89 Note that with these images, ground truth blur kernels were independently estimated using a special capturing process [8]. [sent-191, score-0.554]
90 As shown in the supplementary file, the estimated blur kernel patterns obtained from our algorithm better resemble the ground truth relative to the other methods, a performance result that compensates for any differences in the non-blind step. [sent-192, score-0.64]
91 Whyte et al. [25]: Results on the Pantheon test image from [25] are shown in Figure 3 (top row), where we observe that the deblurred image from Whyte et al. [sent-194, score-0.413]
92 Gupta et al. [7]: We next experiment using the test image Building from [7], which contains large rotational blurring that can be challenging for blind deblurring algorithms. [sent-198, score-0.94]
93 Joshi et al. [13] present a deblurring algorithm that relies upon additional hardware for estimating camera motion. [sent-202, score-0.768]
94 However, even without this additional information, our algorithm produces a better sharp estimate of the Sculpture image from [13], with fewer ringing artifacts and higher resolution details. [sent-203, score-0.452] [sent-207, score-0.406]
95 [Figure 2: Non-uniform deblurring results on the Butchershop and Vintage-car images (columns: Blurry, Harmeling, Hirsch, Ours); better viewed electronically with zooming.]
96 [Figure 3: Non-uniform deblurring results on the Pantheon, Building, and Sculpture images (columns: Blurry, Whyte/Gupta/Joshi, Ours); better viewed electronically with zooming.] [sent-205, score-0.489]
97 6 Conclusion This paper presents a strikingly simple yet effective method for non-uniform camera shake removal based upon a principled, transparent cost function that is open to analysis and further extensions/refinements. [sent-209, score-0.472]
98 One such simple example is a penalty of the form $\sum_i \log\left(\sqrt{\lambda} + |x_i|\|\bar{w}_i\|_2\right)$; many others are possible. [sent-213, score-0.239]
99 Richardson-Lucy deblurring for scenes under a projective motion path. [sent-386, score-0.551]
100 Multi-image blind deblurring using a coupled adaptive sparse prior. [sent-430, score-0.736]
wordName wordTfidf (topN-words)
[('blur', 0.528), ('deblurring', 0.452), ('blind', 0.24), ('image', 0.188), ('shake', 0.185), ('camera', 0.182), ('blurry', 0.151), ('wi', 0.133), ('penalty', 0.106), ('concavity', 0.103), ('motion', 0.099), ('deconvolution', 0.094), ('hirsch', 0.093), ('lurry', 0.093), ('wipf', 0.093), ('sharp', 0.088), ('concave', 0.082), ('ht', 0.079), ('joshi', 0.075), ('homotopy', 0.075), ('hx', 0.074), ('harmeling', 0.074), ('pmp', 0.074), ('whyte', 0.074), ('continuation', 0.072), ('pj', 0.063), ('blurring', 0.06), ('kernel', 0.059), ('ur', 0.056), ('ringing', 0.056), ('reparameterization', 0.054), ('siggraph', 0.054), ('regions', 0.051), ('minima', 0.051), ('hz', 0.05), ('images', 0.049), ('durand', 0.049), ('wj', 0.048), ('normalization', 0.045), ('sparse', 0.044), ('norms', 0.044), ('delta', 0.044), ('gupta', 0.044), ('removal', 0.039), ('undesirable', 0.039), ('levin', 0.039), ('transparent', 0.039), ('degree', 0.038), ('penalization', 0.037), ('artifacts', 0.037), ('cho', 0.037), ('armeling', 0.037), ('butchershop', 0.037), ('deblurred', 0.037), ('eff', 0.037), ('electronically', 0.037), ('implementational', 0.037), ('inertial', 0.037), ('irsch', 0.037), ('pantheon', 0.037), ('sculpture', 0.037), ('zooming', 0.037), ('operators', 0.037), ('rotations', 0.036), ('greater', 0.036), ('hardware', 0.035), ('column', 0.034), ('local', 0.034), ('dominate', 0.033), ('zitnick', 0.033), ('script', 0.033), ('cvpr', 0.032), ('sparsity', 0.032), ('degenerate', 0.032), ('penalized', 0.03), ('ei', 0.03), ('estimation', 0.029), ('spatially', 0.029), ('uniform', 0.029), ('operator', 0.029), ('consequently', 0.029), ('gradients', 0.029), ('china', 0.029), ('engineered', 0.028), ('presently', 0.028), ('uncovered', 0.028), ('tai', 0.028), ('edges', 0.028), ('meaning', 0.028), ('relative', 0.027), ('cost', 0.027), ('favoring', 0.027), ('promoting', 0.027), ('estimated', 0.026), ('log', 0.026), ('noise', 0.026), ('salient', 0.026), ('comparisons', 0.026), ('asia', 0.026), ('idealized', 0.026)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000006 212 nips-2013-Non-Uniform Camera Shake Removal Using a Spatially-Adaptive Sparse Penalty
Author: Haichao Zhang, David Wipf
Abstract: Typical blur from camera shake often deviates from the standard uniform convolutional assumption, in part because of problematic rotations which create greater blurring away from some unknown center point. Consequently, successful blind deconvolution for removing shake artifacts requires the estimation of a spatially-varying or non-uniform blur operator. Using ideas from Bayesian inference and convex analysis, this paper derives a simple non-uniform blind deblurring algorithm with a spatially-adaptive image penalty. Through an implicit normalization process, this penalty automatically adjusts its shape based on the estimated degree of local blur and image structure such that regions with large blur or few prominent edges are discounted. Remaining regions with modest blur and revealing edges therefore dominate on average without explicitly incorporating structure-selection heuristics. The algorithm can be implemented using an optimization strategy that is virtually tuning-parameter free and simpler than existing methods, and likely can be applied in other settings such as dictionary learning. Detailed theoretical analysis and empirical comparisons on real images serve as validation.
2 0.15998206 37 nips-2013-Approximate Bayesian Image Interpretation using Generative Probabilistic Graphics Programs
Author: Vikash Mansinghka, Tejas D. Kulkarni, Yura N. Perov, Josh Tenenbaum
Abstract: The idea of computer vision as the Bayesian inverse problem to computer graphics has a long history and an appealing elegance, but it has proved difficult to directly implement. Instead, most vision tasks are approached via complex bottom-up processing pipelines. Here we show that it is possible to write short, simple probabilistic graphics programs that define flexible generative models and to automatically invert them to interpret real-world images. Generative probabilistic graphics programs (GPGP) consist of a stochastic scene generator, a renderer based on graphics software, a stochastic likelihood model linking the renderer’s output and the data, and latent variables that adjust the fidelity of the renderer and the tolerance of the likelihood. Representations and algorithms from computer graphics are used as the deterministic backbone for highly approximate and stochastic generative models. This formulation combines probabilistic programming, computer graphics, and approximate Bayesian computation, and depends only on generalpurpose, automatic inference techniques. We describe two applications: reading sequences of degraded and adversarially obscured characters, and inferring 3D road models from vehicle-mounted camera images. Each of the probabilistic graphics programs we present relies on under 20 lines of probabilistic code, and yields accurate, approximately Bayesian inferences about real-world images. 1
3 0.10016588 31 nips-2013-Adaptivity to Local Smoothness and Dimension in Kernel Regression
Author: Samory Kpotufe, Vikas Garg
Abstract: We present the first result for kernel regression where the procedure adapts locally at a point x to both the unknown local dimension of the metric space X and the unknown Hölder continuity of the regression function at x. The result holds with high probability simultaneously at all points x in a general metric space X of unknown structure. 1
4 0.097379535 321 nips-2013-Supervised Sparse Analysis and Synthesis Operators
Author: Pablo Sprechmann, Roee Litman, Tal Ben Yakar, Alexander M. Bronstein, Guillermo Sapiro
Abstract: In this paper, we propose a new computationally efficient framework for learning sparse models. We formulate a unified approach that contains as particular cases models promoting sparse synthesis and analysis type of priors, and mixtures thereof. The supervised training of the proposed model is formulated as a bilevel optimization problem, in which the operators are optimized to achieve the best possible performance on a specific task, e.g., reconstruction or classification. By restricting the operators to be shift invariant, our approach can be thought of as a way of learning sparsity-promoting convolutional operators. Leveraging recent ideas on fast trainable regressors designed to approximate exact sparse codes, we propose a way of constructing feed-forward networks capable of approximating the learned models at a fraction of the computational cost of exact solvers. In the shift-invariant case, this leads to a principled way of constructing a form of task-specific convolutional networks. We illustrate the proposed models on several experiments in music analysis and image processing applications. 1
5 0.080043405 59 nips-2013-Blind Calibration in Compressed Sensing using Message Passing Algorithms
Author: Christophe Schulke, Francesco Caltagirone, Florent Krzakala, Lenka Zdeborova
Abstract: Compressed sensing (CS) is a concept that allows to acquire compressible signals with a small number of measurements. As such it is very attractive for hardware implementations. Therefore, correct calibration of the hardware is a central issue. In this paper we study the so-called blind calibration, i.e. when the training signals that are available to perform the calibration are sparse but unknown. We extend the approximate message passing (AMP) algorithm used in CS to the case of blind calibration. In the calibration-AMP, both the gains on the sensors and the elements of the signals are treated as unknowns. Our algorithm is also applicable to settings in which the sensors distort the measurements in other ways than multiplication by a gain, unlike previously suggested blind calibration algorithms based on convex relaxations. We study numerically the phase diagram of the blind calibration problem, and show that even in cases where convex relaxation is possible, our algorithm requires a smaller number of measurements and/or signals in order to perform well. 1
6 0.079266444 226 nips-2013-One-shot learning by inverting a compositional causal process
7 0.063978024 114 nips-2013-Extracting regions of interest from biological images with convolutional sparse block coding
8 0.060203537 190 nips-2013-Mid-level Visual Element Discovery as Discriminative Mode Seeking
9 0.054935947 351 nips-2013-What Are the Invariant Occlusive Components of Image Patches? A Probabilistic Generative Approach
10 0.054626271 251 nips-2013-Predicting Parameters in Deep Learning
11 0.054257147 89 nips-2013-Dimension-Free Exponentiated Gradient
12 0.054033298 300 nips-2013-Solving the multi-way matching problem by permutation synchronization
13 0.053788688 281 nips-2013-Robust Low Rank Kernel Embeddings of Multivariate Distributions
14 0.052571081 167 nips-2013-Learning the Local Statistics of Optical Flow
15 0.051548291 195 nips-2013-Modeling Clutter Perception using Parametric Proto-object Partitioning
16 0.04978361 356 nips-2013-Zero-Shot Learning Through Cross-Modal Transfer
17 0.04881981 260 nips-2013-RNADE: The real-valued neural autoregressive density-estimator
18 0.048678845 166 nips-2013-Learning invariant representations and applications to face verification
19 0.048394945 91 nips-2013-Dirty Statistical Models
20 0.04588915 349 nips-2013-Visual Concept Learning: Combining Machine Vision and Bayesian Generalization on Concept Hierarchies
topicId topicWeight
[(0, 0.152), (1, 0.068), (2, -0.019), (3, -0.021), (4, 0.038), (5, -0.01), (6, -0.034), (7, 0.012), (8, -0.059), (9, 0.015), (10, -0.049), (11, 0.023), (12, -0.029), (13, 0.002), (14, -0.016), (15, 0.047), (16, -0.037), (17, -0.082), (18, -0.068), (19, -0.019), (20, -0.019), (21, 0.015), (22, 0.03), (23, 0.019), (24, -0.077), (25, -0.003), (26, 0.034), (27, -0.015), (28, 0.023), (29, -0.031), (30, 0.003), (31, 0.077), (32, 0.017), (33, 0.124), (34, -0.031), (35, -0.007), (36, 0.024), (37, 0.008), (38, 0.08), (39, -0.005), (40, 0.079), (41, 0.089), (42, -0.077), (43, -0.107), (44, -0.099), (45, -0.062), (46, 0.063), (47, 0.047), (48, 0.04), (49, -0.041)]
simIndex simValue paperId paperTitle
same-paper 1 0.92623353 212 nips-2013-Non-Uniform Camera Shake Removal Using a Spatially-Adaptive Sparse Penalty
Author: Haichao Zhang, David Wipf
Abstract: Typical blur from camera shake often deviates from the standard uniform convolutional assumption, in part because of problematic rotations which create greater blurring away from some unknown center point. Consequently, successful blind deconvolution for removing shake artifacts requires the estimation of a spatially-varying or non-uniform blur operator. Using ideas from Bayesian inference and convex analysis, this paper derives a simple non-uniform blind deblurring algorithm with a spatially-adaptive image penalty. Through an implicit normalization process, this penalty automatically adjusts its shape based on the estimated degree of local blur and image structure such that regions with large blur or few prominent edges are discounted. Remaining regions with modest blur and revealing edges therefore dominate on average without explicitly incorporating structure-selection heuristics. The algorithm can be implemented using an optimization strategy that is virtually tuning-parameter free and simpler than existing methods, and likely can be applied in other settings such as dictionary learning. Detailed theoretical analysis and empirical comparisons on real images serve as validation.
2 0.76009774 37 nips-2013-Approximate Bayesian Image Interpretation using Generative Probabilistic Graphics Programs
Author: Vikash Mansinghka, Tejas D. Kulkarni, Yura N. Perov, Josh Tenenbaum
Abstract: The idea of computer vision as the Bayesian inverse problem to computer graphics has a long history and an appealing elegance, but it has proved difficult to directly implement. Instead, most vision tasks are approached via complex bottom-up processing pipelines. Here we show that it is possible to write short, simple probabilistic graphics programs that define flexible generative models and to automatically invert them to interpret real-world images. Generative probabilistic graphics programs (GPGP) consist of a stochastic scene generator, a renderer based on graphics software, a stochastic likelihood model linking the renderer’s output and the data, and latent variables that adjust the fidelity of the renderer and the tolerance of the likelihood. Representations and algorithms from computer graphics are used as the deterministic backbone for highly approximate and stochastic generative models. This formulation combines probabilistic programming, computer graphics, and approximate Bayesian computation, and depends only on generalpurpose, automatic inference techniques. We describe two applications: reading sequences of degraded and adversarially obscured characters, and inferring 3D road models from vehicle-mounted camera images. Each of the probabilistic graphics programs we present relies on under 20 lines of probabilistic code, and yields accurate, approximately Bayesian inferences about real-world images. 1
3 0.67880166 195 nips-2013-Modeling Clutter Perception using Parametric Proto-object Partitioning
Author: Chen-Ping Yu, Wen-Yu Hua, Dimitris Samaras, Greg Zelinsky
Abstract: Visual clutter, the perception of an image as being crowded and disordered, affects aspects of our lives ranging from object detection to aesthetics, yet relatively little effort has been made to model this important and ubiquitous percept. Our approach models clutter as the number of proto-objects segmented from an image, with proto-objects defined as groupings of superpixels that are similar in intensity, color, and gradient orientation features. We introduce a novel parametric method of clustering superpixels by modeling mixture of Weibulls on Earth Mover’s Distance statistics, then taking the normalized number of proto-objects following partitioning as our estimate of clutter perception. We validated this model using a new 90-image dataset of real world scenes rank ordered by human raters for clutter, and showed that our method not only predicted clutter extremely well (Spearman’s ρ = 0.8038, p < 0.001), but also outperformed all existing clutter perception models and even a behavioral object segmentation ground truth. We conclude that the number of proto-objects in an image affects clutter perception more than the number of objects or features. 1
4 0.66078866 114 nips-2013-Extracting regions of interest from biological images with convolutional sparse block coding
Author: Marius Pachitariu, Adam M. Packer, Noah Pettit, Henry Dalgleish, Michael Hausser, Maneesh Sahani
Abstract: Biological tissue is often composed of cells with similar morphologies replicated throughout large volumes and many biological applications rely on the accurate identification of these cells and their locations from image data. Here we develop a generative model that captures the regularities present in images composed of repeating elements of a few different types. Formally, the model can be described as convolutional sparse block coding. For inference we use a variant of convolutional matching pursuit adapted to block-based representations. We extend the KSVD learning algorithm to subspaces by retaining several principal vectors from the SVD decomposition instead of just one. Good models with little cross-talk between subspaces can be obtained by learning the blocks incrementally. We perform extensive experiments on simulated images and the inference algorithm consistently recovers a large proportion of the cells with a small number of false positives. We fit the convolutional model to noisy GCaMP6 two-photon images of spiking neurons and to Nissl-stained slices of cortical tissue and show that it recovers cell body locations without supervision. The flexibility of the block-based representation is reflected in the variability of the recovered cell shapes. 1
5 0.62924612 190 nips-2013-Mid-level Visual Element Discovery as Discriminative Mode Seeking
Author: Carl Doersch, Abhinav Gupta, Alexei A. Efros
Abstract: Recent work on mid-level visual representations aims to capture information at the level of complexity higher than typical “visual words”, but lower than full-blown semantic objects. Several approaches [5, 6, 12, 23] have been proposed to discover mid-level visual elements, that are both 1) representative, i.e., frequently occurring within a visual dataset, and 2) visually discriminative. However, the current approaches are rather ad hoc and difficult to analyze and evaluate. In this work, we pose visual element discovery as discriminative mode seeking, drawing connections to the well-known and well-studied mean-shift algorithm [2, 1, 4, 8]. Given a weakly-labeled image collection, our method discovers visually-coherent patch clusters that are maximally discriminative with respect to the labels. One advantage of our formulation is that it requires only a single pass through the data. We also propose the Purity-Coverage plot as a principled way of experimentally analyzing and evaluating different visual discovery approaches, and compare our method against prior work on the Paris Street View dataset of [5]. We also evaluate our method on the task of scene classification, demonstrating state-of-the-art performance on the MIT Scene-67 dataset. 1
6 0.61207211 321 nips-2013-Supervised Sparse Analysis and Synthesis Operators
8 0.5865258 167 nips-2013-Learning the Local Statistics of Optical Flow
9 0.56364399 226 nips-2013-One-shot learning by inverting a compositional causal process
10 0.55265796 166 nips-2013-Learning invariant representations and applications to face verification
11 0.53869534 119 nips-2013-Fast Template Evaluation with Vector Quantization
12 0.52954292 138 nips-2013-Higher Order Priors for Joint Intrinsic Image, Objects, and Attributes Estimation
13 0.51182008 163 nips-2013-Learning a Deep Compact Image Representation for Visual Tracking
14 0.51009625 329 nips-2013-Third-Order Edge Statistics: Contour Continuation, Curvature, and Cortical Connections
15 0.49860904 170 nips-2013-Learning with Invariance via Linear Functionals on Reproducing Kernel Hilbert Space
16 0.49770665 31 nips-2013-Adaptivity to Local Smoothness and Dimension in Kernel Regression
17 0.4972069 350 nips-2013-Wavelets on Graphs via Deep Learning
18 0.49472612 256 nips-2013-Probabilistic Principal Geodesic Analysis
19 0.48869246 153 nips-2013-Learning Feature Selection Dependencies in Multi-task Learning
20 0.48732778 214 nips-2013-On Algorithms for Sparse Multi-factor NMF
topicId topicWeight
[(16, 0.027), (33, 0.584), (34, 0.079), (41, 0.017), (49, 0.022), (56, 0.081), (70, 0.022), (85, 0.031), (89, 0.018), (93, 0.037), (95, 0.012)]
simIndex simValue paperId paperTitle
1 0.99748629 88 nips-2013-Designed Measurements for Vector Count Data
Author: Liming Wang, David Carlson, Miguel Rodrigues, David Wilcox, Robert Calderbank, Lawrence Carin
Abstract: We consider design of linear projection measurements for a vector Poisson signal model. The projections are performed on the vector Poisson rate, $X \in \mathbb{R}_+^n$, and the observed data are a vector of counts, $Y \in \mathbb{Z}_+^m$. The projection matrix is designed by maximizing mutual information between Y and X, I(Y; X). When there is a latent class label C ∈ {1, . . . , L} associated with X, we consider the mutual information with respect to Y and C, I(Y; C). New analytic expressions for the gradient of I(Y; X) and I(Y; C) are presented, with the gradient taken with respect to the measurement matrix. Connections are made to the more widely studied Gaussian measurement model. Example results are presented for compressive topic modeling of a document corpus (word counting), and hyperspectral compressive sensing for chemical classification (photon counting). 1
2 0.99537235 81 nips-2013-DeViSE: A Deep Visual-Semantic Embedding Model
Author: Andrea Frome, Greg S. Corrado, Jon Shlens, Samy Bengio, Jeff Dean, Marc'Aurelio Ranzato, Tomas Mikolov
Abstract: Modern visual recognition systems are often limited in their ability to scale to large numbers of object categories. This limitation is in part due to the increasing difficulty of acquiring sufficient training data in the form of labeled images as the number of object categories grows. One remedy is to leverage data from other sources – such as text data – both to train visual models and to constrain their predictions. In this paper we present a new deep visual-semantic embedding model trained to identify visual objects using both labeled image data as well as semantic information gleaned from unannotated text. We demonstrate that this model matches state-of-the-art performance on the 1000-class ImageNet object recognition challenge while making more semantically reasonable errors, and also show that the semantic information can be exploited to make predictions about tens of thousands of image labels not observed during training. Semantic knowledge improves such zero-shot predictions achieving hit rates of up to 18% across thousands of novel labels never seen by the visual model. 1
3 0.99522758 217 nips-2013-On Poisson Graphical Models
Author: Eunho Yang, Pradeep Ravikumar, Genevera I. Allen, Zhandong Liu
Abstract: Undirected graphical models, such as Gaussian graphical models, Ising, and multinomial/categorical graphical models, are widely used in a variety of applications for modeling distributions over a large number of variables. These standard instances, however, are ill-suited to modeling count data, which are increasingly ubiquitous in big-data settings such as genomic sequencing data, user-ratings data, spatial incidence data, climate studies, and site visits. Existing classes of Poisson graphical models, which arise as the joint distributions that correspond to Poisson distributed node-conditional distributions, have a major drawback: they can only model negative conditional dependencies for reasons of normalizability given its infinite domain. In this paper, our objective is to modify the Poisson graphical model distribution so that it can capture a rich dependence structure between count-valued variables. We begin by discussing two strategies for truncating the Poisson distribution and show that only one of these leads to a valid joint distribution. While this model can accommodate a wider range of conditional dependencies, some limitations still remain. To address this, we investigate two additional novel variants of the Poisson distribution and their corresponding joint graphical model distributions. Our three novel approaches provide classes of Poisson-like graphical models that can capture both positive and negative conditional dependencies between count-valued variables. One can learn the graph structure of our models via penalized neighborhood selection, and we demonstrate the performance of our methods by learning simulated networks as well as a network from microRNA-sequencing data. 1
4 0.99247879 306 nips-2013-Speeding up Permutation Testing in Neuroimaging
Author: Chris Hinrichs, Vamsi Ithapu, Qinyuan Sun, Sterling C. Johnson, Vikas Singh
Abstract: Multiple hypothesis testing is a significant problem in nearly all neuroimaging studies. In order to correct for this phenomena, we require a reliable estimate of the Family-Wise Error Rate (FWER). The well known Bonferroni correction method, while simple to implement, is quite conservative, and can substantially under-power a study because it ignores dependencies between test statistics. Permutation testing, on the other hand, is an exact, non-parametric method of estimating the FWER for a given α-threshold, but for acceptably low thresholds the computational burden can be prohibitive. In this paper, we show that permutation testing in fact amounts to populating the columns of a very large matrix P. By analyzing the spectrum of this matrix, under certain conditions, we see that P has a low-rank plus a low-variance residual decomposition which makes it suitable for highly sub–sampled — on the order of 0.5% — matrix completion methods. Based on this observation, we propose a novel permutation testing methodology which offers a large speedup, without sacrificing the fidelity of the estimated FWER. Our evaluations on four different neuroimaging datasets show that a computational speedup factor of roughly 50× can be achieved while recovering the FWER distribution up to very high accuracy. Further, we show that the estimated α-threshold is also recovered faithfully, and is stable. 1
5 0.98845536 160 nips-2013-Learning Stochastic Feedforward Neural Networks
Author: Yichuan Tang, Ruslan Salakhutdinov
Abstract: Multilayer perceptrons (MLPs) or neural networks are popular models used for nonlinear regression and classification tasks. As regressors, MLPs model the conditional distribution of the predictor variables Y given the input variables X. However, this predictive distribution is assumed to be unimodal (e.g. Gaussian). For tasks involving structured prediction, the conditional distribution should be multi-modal, resulting in one-to-many mappings. By using stochastic hidden variables rather than deterministic ones, Sigmoid Belief Nets (SBNs) can induce a rich multimodal distribution in the output space. However, previously proposed learning algorithms for SBNs are not efficient and unsuitable for modeling real-valued data. In this paper, we propose a stochastic feedforward network with hidden layers composed of both deterministic and stochastic variables. A new Generalized EM training procedure using importance sampling allows us to efficiently learn complicated conditional distributions. Our model achieves superior performance on synthetic and facial expressions datasets compared to conditional Restricted Boltzmann Machines and Mixture Density Networks. In addition, the latent features of our model improves classification and can learn to generate colorful textures of objects. 1
6 0.98236424 46 nips-2013-Bayesian Estimation of Latently-grouped Parameters in Undirected Graphical Models
same-paper 7 0.98014611 212 nips-2013-Non-Uniform Camera Shake Removal Using a Spatially-Adaptive Sparse Penalty
8 0.97107434 253 nips-2013-Prior-free and prior-dependent regret bounds for Thompson Sampling
9 0.96768957 222 nips-2013-On the Linear Convergence of the Proximal Gradient Method for Trace Norm Regularization
10 0.9460572 356 nips-2013-Zero-Shot Learning Through Cross-Modal Transfer
11 0.9322297 93 nips-2013-Discriminative Transfer Learning with Tree-based Priors
12 0.91224408 200 nips-2013-Multi-Prediction Deep Boltzmann Machines
13 0.90913558 254 nips-2013-Probabilistic Low-Rank Matrix Completion with Adaptive Spectral Regularization Algorithms
14 0.90740269 349 nips-2013-Visual Concept Learning: Combining Machine Vision and Bayesian Generalization on Concept Hierarchies
15 0.90658182 334 nips-2013-Training and Analysing Deep Recurrent Neural Networks
16 0.90443754 67 nips-2013-Conditional Random Fields via Univariate Exponential Families
17 0.90442079 260 nips-2013-RNADE: The real-valued neural autoregressive density-estimator
18 0.90080017 315 nips-2013-Stochastic Ratio Matching of RBMs for Sparse High-Dimensional Inputs
19 0.89850003 226 nips-2013-One-shot learning by inverting a compositional causal process
20 0.89784724 335 nips-2013-Transfer Learning in a Transductive Setting