nips nips2008 nips2008-192 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Siwei Lyu, Eero P. Simoncelli
Abstract: We consider the problem of transforming a signal to a representation in which the components are statistically independent. When the signal is generated as a linear transformation of independent Gaussian or non-Gaussian sources, the solution may be computed using a linear transformation (PCA or ICA, respectively). Here, we consider a complementary case, in which the source is non-Gaussian but elliptically symmetric. Such a source cannot be decomposed into independent components using a linear transform, but we show that a simple nonlinear transformation, which we call radial Gaussianization (RG), is able to remove all dependencies. We apply this methodology to natural signals, demonstrating that the joint distributions of nearby bandpass filter responses, for both sounds and images, are closer to being elliptically symmetric than linearly transformed factorial sources. Consistent with this, we demonstrate that the reduction in dependency achieved by applying RG to either pairs or blocks of bandpass filter responses is significantly greater than that achieved by PCA or ICA.
Reference: text
sentIndex sentText sentNum sentScore
1 Reducing statistical dependencies in natural signals using radial Gaussianization Siwei Lyu Computer Science Department University at Albany, SUNY Albany, NY 12222 lsw@cs. [sent-1, score-0.425]
2 When the signal is generated as a linear transformation of independent Gaussian or non-Gaussian sources, the solution may be computed using a linear transformation (PCA or ICA, respectively). [sent-7, score-0.242]
3 Here, we consider a complementary case, in which the source is non-Gaussian but elliptically symmetric. [sent-8, score-0.31]
4 Such a source cannot be decomposed into independent components using a linear transform, but we show that a simple nonlinear transformation, which we call radial Gaussianization (RG), is able to remove all dependencies. [sent-9, score-0.347]
5 We apply this methodology to natural signals, demonstrating that the joint distributions of nearby bandpass filter responses, for both sounds and images, are closer to being elliptically symmetric than linearly transformed factorial sources. [sent-10, score-1.129]
6 Consistent with this, we demonstrate that the reduction in dependency achieved by applying RG to either pairs or blocks of bandpass filter responses is significantly greater than that achieved by PCA or ICA. [sent-11, score-0.699]
7 In the context of biological sensory systems, the efficient coding hypothesis [1, 2] proposes that the principle of reducing redundancies in natural signals can be used to explain various properties of biological perceptual systems. [sent-13, score-0.301]
8 Given a source model, the problem of deriving an appropriate transformation to remove statistical dependencies, based on the statistics of observed samples, has been studied for more than a century. [sent-14, score-0.171]
9 The most well-known example is principal components analysis (PCA), a linear transformation derived from the second-order signal statistics (i. [sent-15, score-0.169]
10 Over the past two decades, a more general method, known as independent component analysis (ICA), has been developed to handle the case when the signal is sampled from a linearly transformed factorial source. [sent-18, score-0.366]
11 ICA and related methods have shown success in many applications, especially in deriving optimal representations for natural signals [3, 4, 5, 6]. [sent-19, score-0.177]
12 Although PCA and ICA bases may be computed for nearly any source, they are only guaranteed to eliminate dependencies when the assumed source model is correct. [sent-20, score-0.223]
13 Although dependency between the responses of such linear basis functions is reduced compared to that of the original pixels, this reduction is only slightly more than that achieved with PCA or other bandpass filters [7, 8]. [sent-23, score-0.447] [sent-29, score-0.275]
14 The two circles represent the linearly transformed factorial densities as assumed by the ICA methods, and elliptically symmetric densities (ESDs). [sent-26, score-0.964]
15 The factorial densities form a subset of the linearly transformed factorial densities and the spherically symmetric densities form a subset of the ESDs. [sent-28, score-1.122]
17 Furthermore, the responses of ICA and related filters still exhibit striking higher-order dependencies [9, 10, 11]. [sent-30, score-0.206]
18 Here, we consider the dependency elimination problem for the class of source models known as elliptically symmetric densities (ESDs) [12]. [sent-31, score-0.698]
19 We introduce an alternative nonlinear procedure, which we call radial Gaussianization (RG). [sent-33, score-0.203]
20 In RG, the norms of whitened signal vectors are nonlinearly adjusted to ensure that the resulting output density is a spherical Gaussian, whose components are statistically independent. [sent-34, score-0.296]
21 We first show that the joint statistics of proximal bandpass filter responses for natural signals (sounds and images) are better described as an ESD than linearly transformed factorial sources. [sent-35, score-0.874]
22 Consistent with this, we demonstrate that the reduction in dependency achieved by applying RG to such data is significantly greater than that achieved by PCA or ICA. [sent-36, score-0.265]
23 In the special case when Σ is a multiple of the identity matrix, the level sets of p(x) are hyper-spheres and the density is known as a spherically symmetric density (SSD). [sent-40, score-0.354]
24 In fact, the Gaussian is the only density that is both elliptically symmetric and linearly decomposable into independent components [14]. [sent-44, score-0.51]
25 As a special case, a spherical Gaussian is the only spherically symmetric density that is also factorial (i. [sent-46, score-0.569]
26 Apart from the special case of Gaussian densities, a linear transformation such as PCA or ICA cannot completely eliminate dependencies in the ESDs. [sent-51, score-0.222]
27 In particular, PCA and whitening can transform an ESD variable to a spherically symmetric variable, xwht, but the resulting density will not be factorial unless it is Gaussian. [sent-52, score-0.704]
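A minimal Python sketch of this point, under stated assumptions. The sampler uses a two-state Gaussian scale mixture as a stand-in for a whitened, non-Gaussian spherically symmetric source. The mixture, the scale values 0.5 and 2.0, and all function names are illustrative choices, not taken from the paper. The two components come out nearly uncorrelated, yet their squared values stay correlated, so the density is not factorial.

```python
import math
import random

def sample_gsm_2d(n, rng):
    """Draw n points from a 2D spherically symmetric, non-Gaussian density.

    Assumption for illustration: a spherical Gaussian vector multiplied by a
    random scale (0.5 or 2.0) that is shared by both components.
    """
    points = []
    for _ in range(n):
        scale = rng.choice((0.5, 2.0))
        points.append((scale * rng.gauss(0.0, 1.0), scale * rng.gauss(0.0, 1.0)))
    return points

def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

rng = random.Random(0)
pts = sample_gsm_2d(20000, rng)
x1 = [p[0] for p in pts]
x2 = [p[1] for p in pts]

# Second-order dependency is absent (close to zero) ...
linear_corr = correlation(x1, x2)
# ... but the energies of the two components are still positively correlated.
energy_corr = correlation([v * v for v in x1], [v * v for v in x2])
print("linear:", round(linear_corr, 3), "energy:", round(energy_corr, 3))
```

Replacing the random scale with a constant makes the source a spherical Gaussian, in which case the energy correlation should also fall to roughly zero.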
28 (a,e): 2D joint densities of a spherical Gaussian and a non-Gaussian SSD, respectively. [sent-58, score-0.35]
29 (b,f): radial marginal densities of the spherical Gaussian in (a) and the SSD in (e), respectively. [sent-60, score-0.479]
30 (c): the nonlinear mapping that transforms the radii of the source to those of the spherical Gaussian. [sent-62, score-0.321]
31 (d): log marginal densities of the Gaussian in (a) and the SSD in (e), as red dashed line and green solid line, respectively. [sent-63, score-0.197]
32 matrix) to transform xwht to a new set of coordinates maximizing a higher-order contrast function (e. [sent-64, score-0.221]
33 However, for spherically symmetric xwht, p(xwht) is invariant to rotation, and thus unaffected by orthogonal transformations. [sent-67, score-0.385]
34 3 Radial Gaussianization Given that linear transforms are ineffective in removing dependencies from a spherically symmetric variable xwht (and hence the original ESD variable x), we need to consider non-linear mappings. [sent-68, score-0.554]
35 Thus, a natural solution for eliminating the dependencies in a non-Gaussian spherically symmetric xwht is to transform it to a spherical Gaussian. [sent-70, score-0.728]
36 It is natural to restrict to nonlinear mappings that act radially, preserving the spherical symmetry. [sent-72, score-0.26]
37 Specifically, one can show that the generating function of $p(x_{wht})$ is completely determined by its radial marginal distribution: $p_r(r) = \frac{r^{d-1}}{\beta} f(-r^2/2)$, where $r = \|x_{wht}\|$, Γ(·) is the standard Gamma function, and β is the normalizing constant that ensures that the density integrates to one. [sent-73, score-0.468]
38 In the special case of a spherical Gaussian of unit variance, the radial marginal is a chi-density with d degrees of freedom: $p_\chi(r) = \frac{r^{d-1}}{2^{d/2-1}\Gamma(d/2)} \exp(-r^2/2)$. [sent-74, score-0.322]
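As a quick numerical check of the chi density, the sketch below evaluates it with the Python standard library and integrates it with a simple midpoint rule. The function names, the integration range [0, 20], and the step count are assumptions made for illustration.

```python
import math

def chi_pdf(r, d):
    """Chi density with d degrees of freedom, evaluated at radius r >= 0."""
    norm = 2.0 ** (d / 2.0 - 1.0) * math.gamma(d / 2.0)
    return r ** (d - 1) * math.exp(-r * r / 2.0) / norm

def integrate(f, lo, hi, steps=20000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

for d in (2, 4, 16):
    # Each total should be very close to one if the normalization is right.
    total = integrate(lambda r: chi_pdf(r, d), 0.0, 20.0)
    print(d, round(total, 6))
```

For d = 2 the normalizer equals one and the density reduces to r exp(-r^2/2), the Rayleigh density with unit scale.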
39 We define the radial Gaussianization (RG) transformation as $x_{rg} = g(\|x_{wht}\|) \frac{x_{wht}}{\|x_{wht}\|}$, where the nonlinear function g(·) is selected to map the radial marginal density of $x_{wht}$ to the chi-density. [sent-75, score-1.149]
40 In summary, a non-Gaussian ESD signal can be radially Gaussianized by first applying PCA and whitening operations to remove second-order dependency (yielding an SSD), followed by a nonlinear transformation that maps the radial marginal to a chi-density. [sent-83, score-0.599]
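The two-stage procedure summarized above can be sketched for the 2D case with the standard library only. Several choices here are assumptions for illustration, not the authors' implementation: whitening uses the inverse Cholesky factor (a stand-in for PCA plus whitening, from which it differs only by a rotation), the radial map is estimated from sample ranks, and the closed-form chi quantile sqrt(-2 ln(1 - u)) holds only for d = 2. The test data are a hypothetical correlated Gaussian scale mixture.

```python
import math
import random

def whiten_2d(points):
    """Center 2D points and whiten them with the inverse Cholesky factor."""
    n = len(points)
    m1 = sum(p[0] for p in points) / n
    m2 = sum(p[1] for p in points) / n
    a = sum((p[0] - m1) ** 2 for p in points) / n
    b = sum((p[0] - m1) * (p[1] - m2) for p in points) / n
    c = sum((p[1] - m2) ** 2 for p in points) / n
    l11 = math.sqrt(a)
    l21 = b / l11
    l22 = math.sqrt(c - l21 * l21)
    out = []
    for p in points:
        y1 = (p[0] - m1) / l11
        y2 = ((p[1] - m2) - l21 * y1) / l22
        out.append((y1, y2))
    return out

def radial_gaussianize_2d(white):
    """Rescale each radius so the radial marginal matches a chi density (d = 2)."""
    n = len(white)
    radii = [math.hypot(y1, y2) for y1, y2 in white]
    order = sorted(range(n), key=lambda i: radii[i])
    out = [None] * n
    for rank, i in enumerate(order):
        u = (rank + 0.5) / n                          # empirical radial CDF
        target = math.sqrt(-2.0 * math.log(1.0 - u))  # chi quantile for d = 2
        g = target / radii[i] if radii[i] > 0.0 else 0.0
        out[i] = (g * white[i][0], g * white[i][1])
    return out

def kurtosis(values):
    """Fourth standardized moment (equals 3 for a Gaussian)."""
    n = len(values)
    m = sum(values) / n
    var = sum((v - m) ** 2 for v in values) / n
    return sum((v - m) ** 4 for v in values) / n / (var * var)

# Hypothetical elliptically symmetric, non-Gaussian test data.
rng = random.Random(0)
raw = []
for _ in range(20000):
    s = rng.choice((0.5, 2.0))
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    raw.append((2.0 * s * z1, s * (z1 + 0.5 * z2)))

white = whiten_2d(raw)
rg = radial_gaussianize_2d(white)
kurt_before = kurtosis([p[0] for p in white])
kurt_after = kurtosis([p[0] for p in rg])
print("kurtosis before RG:", round(kurt_before, 2), "after RG:", round(kurt_after, 2))
```

In higher dimensions the chi quantile has no such simple closed form, so a numerical inverse CDF or a lookup table would be needed instead.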
41 4 Application to Natural Signals An understanding of the statistical behaviors of source signals is beneficial for many problems in signal processing, and can also provide insights into the design and functionality of biological sensory systems. [sent-84, score-0.345]
42 Specifically, ICA methodologies have been used to derive linear representations for natural sound and image signals whose coefficients are maximally sparse or independent [3, 5, 6]. [sent-87, score-0.32]
43 These analyses generally produced basis sets containing bandpass filters resembling those used to model the early transformations of biological auditory and visual systems. [sent-88, score-0.312]
44 First, the responses of ICA or other bandpass filters exhibit striking dependencies, in which the variance of one filter response can be predicted from the amplitude of another nearby filter response [10, 15]. [sent-90, score-0.392]
45 This suggests that although the marginal densities of the bandpass filter responses are heavy-tailed, their joint density is not consistent with the linearly transformed factorial source model assumed by ICA. [sent-91, score-0.95]
46 Furthermore, the marginal distributions of a wide variety of bandpass filters (even a “filter” with randomly selected zero-mean weights) are all highly kurtotic [7]. [sent-92, score-0.287]
47 This would not be expected for the ICA source model: projecting the local data onto a random direction should result in a density that becomes more Gaussian as the neighborhood size increases, in accordance with a generalized version of the central limit theorem [16]. [sent-93, score-0.162]
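The central-limit argument can be illustrated with a small simulation. This sketch assumes i.i.d. unit-scale Laplacian sources and random unit-norm projection weights. Both are illustrative choices, as are the function names. Under this factorial model the kurtosis of the projection should drift toward the Gaussian value of 3 as the dimension grows.

```python
import math
import random

def laplace(rng):
    """Unit-scale Laplacian sample (difference of two unit exponentials)."""
    return rng.expovariate(1.0) - rng.expovariate(1.0)

def kurtosis(values):
    """Fourth standardized moment (equals 3 for a Gaussian)."""
    n = len(values)
    m = sum(values) / n
    var = sum((v - m) ** 2 for v in values) / n
    return sum((v - m) ** 4 for v in values) / n / (var * var)

def projection_kurtosis(dim, n, rng):
    """Kurtosis of a random unit-norm projection of dim i.i.d. Laplacian sources."""
    w = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    proj = [sum(wi * laplace(rng) for wi in w) for _ in range(n)]
    return kurtosis(proj)

rng = random.Random(1)
for dim in (1, 4, 32):
    print(dim, round(projection_kurtosis(dim, 5000, rng), 2))
```

By contrast, every unit-norm projection of a spherically symmetric source has the same marginal distribution, so its kurtosis does not shrink with dimension in this way.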
48 A recent quantitative study [8] further showed that the oriented bandpass filters obtained through ICA optimization on images lead to a surprisingly small improvement in reducing dependency relative to decorrelation methods such as PCA. [sent-94, score-0.477]
49 Consistent with this, recently developed models for local image statistics model local groups of image bandpass filter responses with non-Gaussian ESDs [e. [sent-96, score-0.387]
50 These all suggest that RG might provide an appropriate means of eliminating dependencies in natural signals. [sent-99, score-0.169]
51 We used sound clips from commercial CDs, which have a sampling frequency of 44100 Hz and a typical length of 15-20 seconds, with contents including animal vocalizations and recordings in natural environments. [sent-103, score-0.162]
52 These sound clips were filtered with bandpass gammatone filters, which are commonly used to model the peripheral auditory system [21]. [sent-104, score-0.342]
53 3 are contour plots of the joint histograms obtained from pairs of coefficients of a bandpass-filtered natural sound, separated with different time intervals. [sent-107, score-0.28]
54 Similar to the empirical observations for natural images [17, 11], the joint densities are non-Gaussian, and have roughly elliptically symmetric contours for temporally proximal pairs. [sent-108, score-0.742]
55 The “bow-tie” shaped conditional distribution, which has also been observed in natural images [10, 11, 15], indicates that the conditional variance of one signal depends on the value of the other. [sent-111, score-0.249]
56 For pairs that are distant, both the second-order correlation and the higher-order dependency become weaker. [sent-113, score-0.178]
57 As a result, the corresponding joint histograms show more resemblance to the factorial product of two one-dimensional super-Gaussian densities (bottom row of column (a) in Fig. [sent-114, score-0.516]
58 3 are the joint and conditional histograms of the transformed data. [sent-118, score-0.257]
59 First, note that when the two signals are nearby, RG is highly effective, as suggested by the roughly Gaussian joint density (equally spaced circular contours), and by the consistent vertical cross-sections of the conditional histogram. [sent-119, score-0.269]
60 3), they are nearly independent, and applying RG can actually increase dependency, as suggested by the irregular shape of the conditional densities (bottom row, column (d)). [sent-123, score-0.238]
61 (a): Contour plots of joint histograms of pairs of band-pass filter responses of a natural sound clip. [sent-130, score-0.416]
62 Each row corresponds to pairs with different temporal separation, and levels are chosen so that a spherical Gaussian density will have equally spaced contours. [sent-131, score-0.305]
63 (b,d): Conditional histograms of the same data shown in (a,c), computed by independently normalizing each column of the joint histogram. [sent-133, score-0.19]
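The column normalization described above can be sketched as follows. The bin layout (equal-width bins over a shared range, rows indexing the second variable and columns the first) and the function names are assumptions for illustration.

```python
def joint_histogram(xs, ys, bins, lo, hi):
    """Count pairs (x, y) on a bins-by-bins grid over [lo, hi) x [lo, hi).

    hist[row][col]: row indexes the y bin, col indexes the x bin.
    """
    width = (hi - lo) / bins
    hist = [[0] * bins for _ in range(bins)]
    for x, y in zip(xs, ys):
        if lo <= x < hi and lo <= y < hi:
            col = min(int((x - lo) / width), bins - 1)
            row = min(int((y - lo) / width), bins - 1)
            hist[row][col] += 1
    return hist

def conditional_histogram(hist):
    """Normalize each column to sum to one, estimating p(y | x bin)."""
    rows, cols = len(hist), len(hist[0])
    col_sums = [sum(hist[r][c] for r in range(rows)) for c in range(cols)]
    return [
        [hist[r][c] / col_sums[c] if col_sums[c] else 0.0 for c in range(cols)]
        for r in range(rows)
    ]

# Tiny usage example on three points and a 2-by-2 grid.
hist = joint_histogram([0.1, 0.1, 0.9], [0.1, 0.9, 0.9], bins=2, lo=0.0, hi=1.0)
cond = conditional_histogram(hist)
print(hist)  # [[1, 0], [1, 1]]
print(cond)  # [[0.5, 0.0], [0.5, 1.0]]
```

If the two variables were independent, every non-empty column of the conditional histogram would look the same up to sampling noise. A bow-tie shape means the column spread changes with the conditioning value.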
64 To compare the effect of different dependency reduction methods, we estimated the MI of pairs of bandpass filter responses with different temporal separations. [sent-140, score-0.563]
65 We computed the MI for each pair of raw signals, as well as pairs of the PCA, ICA and RG transformed signals. [sent-145, score-0.178]
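For a pair of variables the multi-information reduces to the ordinary mutual information, which can be estimated from a 2D histogram. The estimator below is only a stand-in, since this excerpt does not say which estimator the authors used: a plug-in estimate on equal-count (rank-based) bins, with the bin count of 8 and all function names chosen for illustration.

```python
import math
import random

def rank_bins(values, bins):
    """Assign each value to one of `bins` equal-count bins by its rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    idx = [0] * n
    for rank, i in enumerate(order):
        idx[i] = rank * bins // n
    return idx

def binned_mutual_information(xs, ys, bins=8):
    """Plug-in estimate of I(X; Y) in bits from a rank-binned 2D histogram."""
    n = len(xs)
    bx, by = rank_bins(xs, bins), rank_bins(ys, bins)
    joint = {}
    for i, j in zip(bx, by):
        joint[(i, j)] = joint.get((i, j), 0) + 1
    px = [bx.count(k) / n for k in range(bins)]
    py = [by.count(k) / n for k in range(bins)]
    mi = 0.0
    for (i, j), count in joint.items():
        pij = count / n
        mi += pij * math.log2(pij / (px[i] * py[j]))
    return mi

rng = random.Random(3)
a = [rng.gauss(0.0, 1.0) for _ in range(20000)]
b = [rng.gauss(0.0, 1.0) for _ in range(20000)]
print("independent pair:", round(binned_mutual_information(a, b), 4))
```

A binned plug-in estimate is biased upward for finite samples and is capped at log2(bins) bits, so it is better suited to comparing transforms on the same data than to reporting absolute values.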
66 In contrast, the nonlinear RG transformation achieves an impressive reduction (nearly 100%) in MI for pairs separated by less than 0. [sent-151, score-0.262]
67 This can be understood by considering the joint and conditional histograms in Fig. [sent-153, score-0.161]
68 Since the joint density of nearby pairs is approximately elliptically symmetric, ICA cannot provide much improvement beyond what is obtained with PCA, while RG is expected to perform well. [sent-155, score-0.454]
69 On the other hand, the joint densities of more distant pairs (beyond 2. [sent-156, score-0.299]
70 This is a direct result of the fact that the data do not adhere to the elliptically symmetric source model assumptions underlying the RG procedure. [sent-162, score-0.419]
71 2 to 2 msec), there is a transition of the joint densities from elliptically symmetric to factorial (second row in Fig. [sent-164, score-0.739]
72 Left: Multi-information (in bits/coefficient) for pairs of bandpass filter responses of natural audio signals, as a function of temporal separation. [sent-183, score-0.444]
73 Shown are the MI of the raw filter response pairs, as well as the MI of the pairs transformed with PCA, ICA, and RG. [sent-184, score-0.178]
74 Right: Same analysis for pairs of bandpass filter responses averaged over 8 natural images. [sent-186, score-0.444]
75 4) when analyzing pairs of bandpass filter responses of natural images using the data sets described in the next section. [sent-229, score-0.525]
76 2 Dependency Reduction in Natural Images We have also examined the ability of RG to reduce dependencies of image pixel blocks with local mean removed. [sent-231, score-0.192]
77 We examined eight images of natural woodland scenes from the van Hateren database [26]. [sent-232, score-0.181]
78 We extracted the central 1024 × 1024 region from each, computed the log of the intensity values, and then subtracted the local mean [8] by convolving with an isotropic bandpass filter that captures an annulus of frequencies in the Fourier domain ranging from π/4 to π radians/pixel. [sent-233, score-0.247]
79 We denote blocks taken from these bandpass filtered images as xraw. [sent-234, score-0.468]
80 These blocks were then transformed with PCA (denoted xpca), ICA (denoted xica) and RG (denoted xrg). [sent-235, score-0.236]
81 We would like to compare the dependency reduction performance of each of these methods using multi-information. [sent-238, score-0.186]
82 Instead, as in [8], we can avoid direct estimation of MI by evaluating and comparing the differences in MI of the various transformed blocks relative to xraw . [sent-240, score-0.236]
83 Specifically, we use $\Delta I_{pca} = I(x_{raw}) - I(x_{pca})$ as a reference value, and compare this with $\Delta I_{ica} = I(x_{raw}) - I(x_{ica})$ and $\Delta I_{rg} = I(x_{raw}) - I(x_{rg})$. [sent-241, score-0.322]
84 5 are scatter plots of $\Delta I_{pca}$ versus $\Delta I_{ica}$ (red circles) and $\Delta I_{rg}$ (blue pluses) for various block sizes. [sent-244, score-0.183]
85 These results may be attributed to the fact that the joint density for local pixel blocks tends to be close to elliptically symmetric [17, 11]. [sent-248, score-0.523]
86 5 Conclusion We have introduced a new signal transformation known as radial Gaussianization (RG), which can eliminate dependencies of sources with elliptically symmetric densities. [sent-249, score-0.75]
87 Empirically, we have shown that the RG transform is highly effective at removing dependencies between pairs of samples in bandpass filtered sounds and images, and within local blocks of bandpass filtered images. [sent-250, score-0.834]
88 One important issue underlying our development of this methodology is the intimate relation between source models and dependency reduction methods. [sent-251, score-0.3]
89 The class of elliptically symmetric densities represents a generalization of the Gaussian family that is complementary to the class of linearly transformed factorial densities (see Fig. [sent-252, score-0.941]
90 The three dependency reduction methods we have discussed (PCA, ICA and RG) are each associated with one of these classes, and are each guaranteed to produce independent responses when applied to signals drawn from a density belonging to the corresponding class. [sent-254, score-0.469]
91 But applying one of these methods to a signal with an incompatible source model may not achieve the expected reduction in dependency (e. [sent-255, score-0.344]
92 An iterative Gaussianization scheme transforms any source model to a spherical Gaussian by alternating between linear ICA transformations and nonlinear histogram matching to map marginal densities to Gaussians [28]. [sent-261, score-0.592]
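The marginal step of such a scheme can be sketched as rank-based histogram matching to a standard normal. This is an illustrative stand-in rather than the procedure of [28]. It covers only the nonlinear half of one iteration (the linear ICA rotation is omitted), and the function name is hypothetical.

```python
from statistics import NormalDist

def gaussianize_marginal(values):
    """Map samples to standard-normal quantiles by rank (empirical histogram matching)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()
    out = [0.0] * n
    for rank, i in enumerate(order):
        # (rank + 0.5) / n stays strictly inside (0, 1), so inv_cdf is defined.
        out[i] = std_normal.inv_cdf((rank + 0.5) / n)
    return out

# The mapping is monotone: sample order is preserved, only spacing changes.
print(gaussianize_marginal([10.0, -3.0, 0.5]))
```

Unlike the radial map in RG, this acts on each coordinate separately, which is why it has to be alternated with linear transforms to affect the joint density.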
93 However, in general, the overall transformation of iterative Gaussianization is an alternating concatenation of many linear/nonlinear transformations, and results in a substantial distortion of the original source space. [sent-262, score-0.171]
94 Another nonlinear transform that has also been shown to be able to reduce higher-order dependencies in natural signals is divisive normalization [15]. [sent-264, score-0.408]
95 In the extended version of this paper [13], we show that there is no ESD source model whose dependencies can be completely eliminated by divisive normalization. [sent-265, score-0.222]
96 On the other hand, divisive normalization provides a rough approximation to RG, which suggests that RG might provide a more principled justification for normalization-like nonlinear behaviors seen in biological sensory systems. [sent-266, score-0.191]
97 Second, the RG methodology provides a solution to the efficient coding problem for ESD signals in the noise-free case, and it is worthwhile to consider how the solution would be affected by the presence of sensor and/or channel noise. [sent-269, score-0.164]
98 Third, we have shown that RG substantially reduces dependency for nearby samples of bandpass filtered image/sound, but that performance worsens as the coefficients become more separated, where their joint densities are closer to factorial. [sent-270, score-0.624]
99 Recent models of natural images [29, 30] have used Markov random fields based on local elliptically symmetric models, and these are seen to provide a natural transition of pairwise joint densities from elliptically symmetric to factorial. [sent-271, score-1.095]
100 Nonlinear extraction of “independent components” of elliptically symmetric densities using radial Gaussianization. [sent-340, score-0.636]
wordName wordTfidf (topN-words)
[('rg', 0.486), ('ica', 0.32), ('bandpass', 0.247), ('elliptically', 0.224), ('esd', 0.183), ('xwht', 0.183), ('pca', 0.161), ('gaussianization', 0.16), ('densities', 0.157), ('factorial', 0.155), ('radial', 0.146), ('mi', 0.139), ('spherical', 0.136), ('dependency', 0.122), ('ssd', 0.116), ('signals', 0.11), ('symmetric', 0.109), ('lter', 0.102), ('dependencies', 0.102), ('esds', 0.1), ('transformed', 0.096), ('spherically', 0.093), ('source', 0.086), ('transformation', 0.085), ('xraw', 0.083), ('images', 0.081), ('ective', 0.08), ('msec', 0.08), ('histograms', 0.078), ('density', 0.076), ('ltered', 0.075), ('responses', 0.074), ('lters', 0.067), ('natural', 0.067), ('reduction', 0.064), ('sound', 0.062), ('sounds', 0.062), ('nonlinear', 0.057), ('joint', 0.057), ('blocks', 0.057), ('pairs', 0.056), ('erent', 0.052), ('sensory', 0.05), ('blk', 0.05), ('irg', 0.05), ('xrg', 0.05), ('whitening', 0.05), ('di', 0.049), ('signal', 0.049), ('linearly', 0.043), ('transforms', 0.042), ('nearby', 0.041), ('transformations', 0.041), ('iica', 0.04), ('marginal', 0.04), ('separation', 0.038), ('transform', 0.038), ('gaussian', 0.037), ('coe', 0.037), ('lyu', 0.037), ('row', 0.037), ('components', 0.035), ('eliminate', 0.035), ('divisive', 0.034), ('albany', 0.033), ('clips', 0.033), ('venn', 0.033), ('xica', 0.033), ('histogram', 0.033), ('van', 0.033), ('image', 0.033), ('column', 0.032), ('striking', 0.03), ('pluses', 0.029), ('separations', 0.029), ('distant', 0.029), ('methodology', 0.028), ('achieved', 0.028), ('volume', 0.027), ('decorrelation', 0.027), ('radially', 0.027), ('radical', 0.027), ('behaviors', 0.026), ('conditional', 0.026), ('raw', 0.026), ('coding', 0.026), ('vision', 0.025), ('receptive', 0.025), ('removing', 0.025), ('eero', 0.025), ('methodologies', 0.025), ('hateren', 0.025), ('proximal', 0.025), ('biological', 0.024), ('independent', 0.023), ('circles', 0.023), ('applying', 0.023), ('normalizing', 0.023), ('plots', 0.022), ('contours', 0.022)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999923 192 nips-2008-Reducing statistical dependencies in natural signals using radial Gaussianization
2 0.35205173 232 nips-2008-The Conjoint Effect of Divisive Normalization and Orientation Selectivity on Redundancy Reduction
Author: Fabian H. Sinz, Matthias Bethge
Abstract: Bandpass filtering, orientation selectivity, and contrast gain control are prominent features of sensory coding at the level of V1 simple cells. While the effect of bandpass filtering and orientation selectivity can be assessed within a linear model, contrast gain control is an inherently nonlinear computation. Here we employ the class of Lp elliptically contoured distributions to investigate the extent to which the two features—orientation selectivity and contrast gain control—are suited to model the statistics of natural images. Within this framework we find that contrast gain control can play a significant role for the removal of redundancies in natural images. Orientation selectivity, in contrast, has only a very limited potential for redundancy reduction.
3 0.13780133 102 nips-2008-ICA based on a Smooth Estimation of the Differential Entropy
Author: Lev Faivishevsky, Jacob Goldberger
Abstract: In this paper we introduce the MeanNN approach for estimation of main information theoretic measures such as differential entropy, mutual information and divergence. As opposed to other nonparametric approaches the MeanNN results in smooth differentiable functions of the data samples with clear geometrical interpretation. Then we apply the proposed estimators to the ICA problem and obtain a smooth expression for the mutual information that can be analytically optimized by gradient descent methods. The improved performance of the proposed ICA algorithm is demonstrated on several test examples in comparison with state-of-the-art techniques.
4 0.13215248 234 nips-2008-The Infinite Factorial Hidden Markov Model
Author: Jurgen V. Gael, Yee W. Teh, Zoubin Ghahramani
Abstract: We introduce a new probability distribution over a potentially infinite number of binary Markov chains which we call the Markov Indian buffet process. This process extends the IBP to allow temporal dependencies in the hidden variables. We use this stochastic process to build a nonparametric extension of the factorial hidden Markov model. After constructing an inference scheme which combines slice sampling and dynamic programming we demonstrate how the infinite factorial hidden Markov model can be used for blind source separation.
5 0.096816115 171 nips-2008-Online Prediction on Large Diameter Graphs
Author: Mark Herbster, Guy Lever, Massimiliano Pontil
Abstract: We continue our study of online prediction of the labelling of a graph. We show a fundamental limitation of Laplacian-based algorithms: if the graph has a large diameter then the number of mistakes made by such algorithms may be proportional to the square root of the number of vertices, even when tackling simple problems. We overcome this drawback by means of an efficient algorithm which achieves a logarithmic mistake bound. It is based on the notion of a spine, a path graph which provides a linear embedding of the original graph. In practice, graphs may exhibit cluster structure; thus in the last part, we present a modified algorithm which achieves the “best of both worlds”: it performs well locally in the presence of cluster structure, and globally on large diameter graphs.
7 0.080106676 84 nips-2008-Fast Prediction on a Tree
8 0.078372426 118 nips-2008-Learning Transformational Invariants from Natural Movies
9 0.073119655 227 nips-2008-Supervised Exponential Family Principal Component Analysis via Convex Optimization
10 0.062261425 24 nips-2008-An improved estimator of Variance Explained in the presence of noise
11 0.059703127 200 nips-2008-Robust Kernel Principal Component Analysis
12 0.059300762 45 nips-2008-Characterizing neural dependencies with copula models
13 0.058549237 243 nips-2008-Understanding Brain Connectivity Patterns during Motor Imagery for Brain-Computer Interfacing
14 0.055169985 31 nips-2008-Bayesian Exponential Family PCA
15 0.054956526 95 nips-2008-Grouping Contours Via a Related Image
16 0.054733127 78 nips-2008-Exact Convex Confidence-Weighted Learning
17 0.053402137 112 nips-2008-Kernel Measures of Independence for non-iid Data
18 0.05334321 75 nips-2008-Estimating vector fields using sparse basis field expansions
19 0.052360523 74 nips-2008-Estimating the Location and Orientation of Complex, Correlated Neural Activity using MEG
20 0.048971955 16 nips-2008-Adaptive Template Matching with Shift-Invariant Semi-NMF
topicId topicWeight
[(0, -0.17), (1, -0.022), (2, 0.105), (3, 0.038), (4, 0.038), (5, 0.02), (6, -0.045), (7, 0.026), (8, 0.089), (9, -0.021), (10, 0.011), (11, -0.036), (12, 0.065), (13, 0.07), (14, -0.209), (15, -0.225), (16, 0.241), (17, 0.252), (18, -0.032), (19, 0.185), (20, -0.065), (21, 0.101), (22, 0.013), (23, 0.084), (24, -0.138), (25, -0.151), (26, -0.032), (27, -0.019), (28, -0.164), (29, -0.082), (30, 0.104), (31, 0.105), (32, -0.078), (33, 0.008), (34, 0.004), (35, -0.123), (36, -0.027), (37, -0.039), (38, 0.024), (39, 0.123), (40, 0.037), (41, 0.047), (42, 0.007), (43, 0.044), (44, 0.074), (45, 0.034), (46, 0.102), (47, -0.111), (48, 0.057), (49, 0.143)]
simIndex simValue paperId paperTitle
same-paper 1 0.96812445 192 nips-2008-Reducing statistical dependencies in natural signals using radial Gaussianization
2 0.90581769 232 nips-2008-The Conjoint Effect of Divisive Normalization and Orientation Selectivity on Redundancy Reduction
Author: Fabian H. Sinz, Matthias Bethge
Abstract: Bandpass filtering, orientation selectivity, and contrast gain control are prominent features of sensory coding at the level of V1 simple cells. While the effect of bandpass filtering and orientation selectivity can be assessed within a linear model, contrast gain control is an inherently nonlinear computation. Here we employ the class of Lp elliptically contoured distributions to investigate the extent to which the two features—orientation selectivity and contrast gain control—are suited to model the statistics of natural images. Within this framework we find that contrast gain control can play a significant role for the removal of redundancies in natural images. Orientation selectivity, in contrast, has only a very limited potential for redundancy reduction.
3 0.48012608 234 nips-2008-The Infinite Factorial Hidden Markov Model
Author: Jurgen V. Gael, Yee W. Teh, Zoubin Ghahramani
Abstract: We introduce a new probability distribution over a potentially infinite number of binary Markov chains which we call the Markov Indian buffet process. This process extends the IBP to allow temporal dependencies in the hidden variables. We use this stochastic process to build a nonparametric extension of the factorial hidden Markov model. After constructing an inference scheme which combines slice sampling and dynamic programming we demonstrate how the infinite factorial hidden Markov model can be used for blind source separation.
4 0.47345629 102 nips-2008-ICA based on a Smooth Estimation of the Differential Entropy
Author: Lev Faivishevsky, Jacob Goldberger
Abstract: In this paper we introduce the MeanNN approach for estimation of main information theoretic measures such as differential entropy, mutual information and divergence. As opposed to other nonparametric approaches the MeanNN results in smooth differentiable functions of the data samples with clear geometrical interpretation. Then we apply the proposed estimators to the ICA problem and obtain a smooth expression for the mutual information that can be analytically optimized by gradient descent methods. The improved performance of the proposed ICA algorithm is demonstrated on several test examples in comparison with state-of-the-art techniques.
5 0.35542914 126 nips-2008-Localized Sliced Inverse Regression
Author: Qiang Wu, Sayan Mukherjee, Feng Liang
Abstract: We developed localized sliced inverse regression for supervised dimension reduction. It has the advantages of preventing degeneracy, increasing estimation accuracy, and automatic subclass discovery in classification problems. A semisupervised version is proposed for the use of unlabeled data. The utility is illustrated on simulated as well as real data sets.
6 0.34658802 227 nips-2008-Supervised Exponential Family Principal Component Analysis via Convex Optimization
7 0.32579938 61 nips-2008-Diffeomorphic Dimensionality Reduction
8 0.3029927 31 nips-2008-Bayesian Exponential Family PCA
9 0.25514975 180 nips-2008-Playing Pinball with non-invasive BCI
10 0.25132009 45 nips-2008-Characterizing neural dependencies with copula models
11 0.23436224 90 nips-2008-Gaussian-process factor analysis for low-dimensional single-trial analysis of neural population activity
12 0.23266415 118 nips-2008-Learning Transformational Invariants from Natural Movies
13 0.22028455 171 nips-2008-Online Prediction on Large Diameter Graphs
14 0.22002955 153 nips-2008-Nonlinear causal discovery with additive noise models
15 0.21575168 66 nips-2008-Dynamic visual attention: searching for coding length increments
16 0.2134874 200 nips-2008-Robust Kernel Principal Component Analysis
17 0.20937486 75 nips-2008-Estimating vector fields using sparse basis field expansions
18 0.20899636 148 nips-2008-Natural Image Denoising with Convolutional Networks
19 0.20881994 222 nips-2008-Stress, noradrenaline, and realistic prediction of mouse behaviour using reinforcement learning
20 0.20860025 173 nips-2008-Optimization on a Budget: A Reinforcement Learning Approach
topicId topicWeight
[(6, 0.064), (7, 0.103), (12, 0.035), (15, 0.011), (28, 0.129), (37, 0.033), (57, 0.089), (59, 0.015), (63, 0.039), (71, 0.019), (77, 0.065), (78, 0.015), (83, 0.041), (88, 0.255)]
simIndex simValue paperId paperTitle
same-paper 1 0.7956354 192 nips-2008-Reducing statistical dependencies in natural signals using radial Gaussianization
Author: Siwei Lyu, Eero P. Simoncelli
Abstract: We consider the problem of transforming a signal to a representation in which the components are statistically independent. When the signal is generated as a linear transformation of independent Gaussian or non-Gaussian sources, the solution may be computed using a linear transformation (PCA or ICA, respectively). Here, we consider a complementary case, in which the source is non-Gaussian but elliptically symmetric. Such a source cannot be decomposed into independent components using a linear transform, but we show that a simple nonlinear transformation, which we call radial Gaussianization (RG), is able to remove all dependencies. We apply this methodology to natural signals, demonstrating that the joint distributions of nearby bandpass filter responses, for both sounds and images, are closer to being elliptically symmetric than linearly transformed factorial sources. Consistent with this, we demonstrate that the reduction in dependency achieved by applying RG to either pairs or blocks of bandpass filter responses is significantly greater than that achieved by PCA or ICA.
2 0.69274831 50 nips-2008-Continuously-adaptive discretization for message-passing algorithms
Author: Michael Isard, John MacCormick, Kannan Achan
Abstract: Continuously-Adaptive Discretization for Message-Passing (CAD-MP) is a new message-passing algorithm for approximate inference. Most message-passing algorithms approximate continuous probability distributions using either: a family of continuous distributions such as the exponential family; a particle-set of discrete samples; or a fixed, uniform discretization. In contrast, CAD-MP uses a discretization that is (i) non-uniform, and (ii) adaptive to the structure of the marginal distributions. Non-uniformity allows CAD-MP to localize interesting features (such as sharp peaks) in the marginal belief distributions with time complexity that scales logarithmically with precision, as opposed to uniform discretization which scales at best linearly. We give a principled method for altering the non-uniform discretization according to information-based measures. CAD-MP is shown in experiments to estimate marginal beliefs much more precisely than competing approaches for the same computational expense.
3 0.61860055 232 nips-2008-The Conjoint Effect of Divisive Normalization and Orientation Selectivity on Redundancy Reduction
Author: Fabian H. Sinz, Matthias Bethge
Abstract: Bandpass filtering, orientation selectivity, and contrast gain control are prominent features of sensory coding at the level of V1 simple cells. While the effect of bandpass filtering and orientation selectivity can be assessed within a linear model, contrast gain control is an inherently nonlinear computation. Here we employ the class of Lp elliptically contoured distributions to investigate the extent to which the two features—orientation selectivity and contrast gain control—are suited to model the statistics of natural images. Within this framework we find that contrast gain control can play a significant role for the removal of redundancies in natural images. Orientation selectivity, in contrast, has only a very limited potential for redundancy reduction.
4 0.61632431 62 nips-2008-Differentiable Sparse Coding
Author: J. A. Bagnell, David M. Bradley
Abstract: Prior work has shown that features which appear to be biologically plausible as well as empirically useful can be found by sparse coding with a prior such as a Laplacian (L1) that promotes sparsity. We show how smoother priors can preserve the benefits of these sparse priors while adding stability to the Maximum A-Posteriori (MAP) estimate that makes it more useful for prediction problems. Additionally, we show how to calculate the derivative of the MAP estimate efficiently with implicit differentiation. One prior that can be differentiated this way is KL-regularization. We demonstrate its effectiveness on a wide variety of applications, and find that online optimization of the parameters of the KL-regularized model can significantly improve prediction performance.
5 0.61398846 216 nips-2008-Sparse probabilistic projections
Author: Cédric Archambeau, Francis R. Bach
Abstract: We present a generative model for performing sparse probabilistic projections, which includes sparse principal component analysis and sparse canonical correlation analysis as special cases. Sparsity is enforced by means of automatic relevance determination or by imposing appropriate prior distributions, such as generalised hyperbolic distributions. We derive a variational Expectation-Maximisation algorithm for the estimation of the hyperparameters and show that our novel probabilistic approach compares favourably to existing techniques. We illustrate how the proposed method can be applied in the context of cryptoanalysis as a preprocessing tool for the construction of template attacks.
6 0.61204469 200 nips-2008-Robust Kernel Principal Component Analysis
7 0.6065293 66 nips-2008-Dynamic visual attention: searching for coding length increments
8 0.60419881 63 nips-2008-Dimensionality Reduction for Data in Multiple Feature Representations
9 0.60358524 79 nips-2008-Exploring Large Feature Spaces with Hierarchical Multiple Kernel Learning
10 0.60315484 182 nips-2008-Posterior Consistency of the Silverman g-prior in Bayesian Model Choice
11 0.60212058 71 nips-2008-Efficient Sampling for Gaussian Process Inference using Control Variables
12 0.60109746 75 nips-2008-Estimating vector fields using sparse basis field expansions
13 0.60031593 118 nips-2008-Learning Transformational Invariants from Natural Movies
14 0.59949034 27 nips-2008-Artificial Olfactory Brain for Mixture Identification
15 0.59912193 194 nips-2008-Regularized Learning with Networks of Features
16 0.59791183 208 nips-2008-Shared Segmentation of Natural Scenes Using Dependent Pitman-Yor Processes
17 0.59698009 205 nips-2008-Semi-supervised Learning with Weakly-Related Unlabeled Data: Towards Better Text Categorization
18 0.59651566 64 nips-2008-DiscLDA: Discriminative Learning for Dimensionality Reduction and Classification
19 0.59502769 221 nips-2008-Stochastic Relational Models for Large-scale Dyadic Data using MCMC
20 0.59478021 197 nips-2008-Relative Performance Guarantees for Approximate Inference in Latent Dirichlet Allocation