nips nips2002 nips2002-10 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Yan Karklin, Michael S. Lewicki
Abstract: We present a hierarchical Bayesian model for learning efficient codes of higher-order structure in natural images. The model, a non-linear generalization of independent component analysis, replaces the standard assumption of independence for the joint distribution of coefficients with a distribution that is adapted to the variance structure of the coefficients of an efficient image basis. This offers a novel description of higher-order image structure and provides a way to learn coarse-coded, sparse-distributed representations of abstract image properties such as object location, scale, and texture.
Reference: text
sentIndex sentText sentNum sentScore
1 Computer Science Department & Center for the Neural Basis of Cognition, Carnegie Mellon University. Abstract: We present a hierarchical Bayesian model for learning efficient codes of higher-order structure in natural images. [sent-6, score-0.244]
2 The model, a non-linear generalization of independent component analysis, replaces the standard assumption of independence for the joint distribution of coefficients with a distribution that is adapted to the variance structure of the coefficients of an efficient image basis. [sent-7, score-0.777]
3 This offers a novel description of higher-order image structure and provides a way to learn coarse-coded, sparse-distributed representations of abstract image properties such as object location, scale, and texture. [sent-8, score-1.068]
4 1 Introduction One of the major challenges in vision is how to derive from the retinal representation higher-order representations that describe properties of surfaces, objects, and scenes. [sent-9, score-0.198]
5 Physiological studies of the visual system have characterized a wide range of response properties, beginning with, for example, simple cells and complex cells. [sent-10, score-0.315]
6 These, however, offer only limited insight into how higher-order properties of images might be represented or even what the higher-order properties might be. [sent-11, score-0.376]
7 by inverting models of the physics of light propagation and surface reflectance properties to recover object and scene properties. [sent-14, score-0.162]
8 A more fundamental limitation, however, is that this formulation of the problem does not explain the adaptive nature of the visual system or how it can learn highly abstract and general representations of objects and surfaces. [sent-16, score-0.258]
9 An alternative approach is to derive representations from the statistics of the images themselves. [sent-17, score-0.319]
10 This information theoretic view, called efficient coding, starts with the observation that there is an equivalence between the degree of structure represented and the efficiency of the code [1]. [sent-18, score-0.172]
11 The hypothesis is that the primary goal of early sensory coding is to encode information efficiently. [sent-19, score-0.261]
12 This theory has been applied to derive efficient codes for natural images and to explain a wide range of response properties of neurons in the visual cortex [2–7]. [sent-20, score-0.7]
13 Most algorithms for learning efficient representations assume either simply that the data are generated by a linear superposition of basis functions, as in independent component analysis (ICA), or, as in sparse coding, that the basis function coefficients are ’sparsified’ by lateral inhibition. [sent-21, score-0.868]
14 Clearly, these simple models are insufficient to capture the rich structure of natural images: although they capture some higher-order statistics (correlations beyond second order), it remains unclear how to go beyond this to discover higher-order image structure. [sent-22, score-1.025]
15 One approach is to learn image classes by embedding the statistical density assumed by ICA in a mixture model [8]. [sent-23, score-0.438]
16 This provides a method for modeling classes of images and for performing automatic scene segmentation, but it assumes a fundamentally local representation and therefore is not suitable for compactly describing the large degree of structure variation across images. [sent-24, score-0.342]
17 With this, one is limited by the choice of the non-linearity and the range of image regularities that can be modeled. [sent-28, score-0.581]
18 In this paper, we take as a starting point the observation by Schwartz and Simoncelli [10] that, for natural images, there are significant statistical dependencies among the variances of filter outputs. [sent-29, score-0.437]
19 By factoring out these dependencies with divisive normalization, Schwartz and Simoncelli showed that the model could account for a wide range of non-linearities observed in neurons in the auditory nerve and primary visual cortex. [sent-30, score-0.351]
20 Here, we propose a statistical model for higher-order structure that learns a basis for the variance regularities in natural images. [sent-31, score-0.865]
21 This higher-order, non-orthogonal basis describes how, for a particular visual image patch, image basis function coefficient variances deviate from the default assumption of independence. [sent-32, score-1.631]
22 This view offers a novel description of higher-order image structure and provides a way to learn sparse distributed representations of abstract image properties such as object location, scale, and surface texture. [sent-33, score-1.067]
23 Efficient coding of natural images. The computational goal of efficient coding is to derive, from the statistics of the pattern ensemble, a compact code that maximally reduces the redundancy in the patterns with minimal loss of information. [sent-34, score-0.883]
24 The standard model assumes that the data are generated using a set of basis functions A and coefficients u: x = Au (1). Because coding efficiency is being optimized, it is necessary, either implicitly or explicitly, for the model to capture the probability distribution of the pattern ensemble. [sent-35, score-0.647]
25 The coefficients ui are assumed to be statistically independent, p(u) = ∏i p(ui) (3). [sent-37, score-0.204]
26 ICA learns efficient codes of natural scenes by adapting the basis vectors to maximize the likelihood of the ensemble of image patterns, p(x1, . . . , xN) = ∏n p(xn |A), which maximizes the independence of the coefficients and optimizes coding efficiency within the limits of the linear model. [sent-38, score-1.128]
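As a rough illustration of this standard linear model, the following sketch (not the authors' code; the basis A, the patch dimensionality, and the Laplacian prior are illustrative assumptions) evaluates the linear-model log-likelihood log p(x|A) = ∑i log p(ui) − log|det A| for a single patch.

import numpy as np

rng = np.random.default_rng(0)
D = 400                                   # dimensionality of a flattened 20x20 image patch
A = rng.standard_normal((D, D))           # hypothetical complete image basis (placeholder for a learned ICA basis)

def encode(x, A):
    # Coefficients u for a patch x under the invertible linear model x = A u
    return np.linalg.solve(A, x)

def log_likelihood(x, A, log_prior):
    # log p(x|A) = sum_i log p(u_i) - log|det A| for a complete, invertible basis
    u = encode(x, A)
    _, logdet = np.linalg.slogdet(A)
    return log_prior(u).sum() - logdet

laplace_log_prior = lambda u: -np.abs(u) - np.log(2.0)   # an assumed sparse prior on the coefficients
x = A @ rng.laplace(size=D)               # a synthetic patch generated by the model itself
print(log_likelihood(x, A, laplace_log_prior))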
28 Figure 1: Statistical dependencies among natural image independent component basis coefficients. [sent-42, score-1.036]
29 Each scatter plot shows the joint distribution of the coefficients of the two basis functions indexed by its row and column. [sent-43, score-0.831]
30 Each point represents the encoding of a 20 × 20 image patch centered at a random location in the image. [sent-44, score-0.473]
31 (a) For complex natural scenes, the joint distributions appear to be independent, because the joint distribution can be approximated by the product of the marginals. [sent-45, score-0.225]
32 (b) Closer inspection of particular image regions (the image in (b) is contained in the lower middle part of the image in (a)) reveals complex statistical dependencies for the same set of basis functions. [sent-46, score-1.481]
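A minimal sketch of the kind of diagnostic behind Figure 1 (assuming a patches array of flattened 20 × 20 patches and a basis A as in the sketch above, both placeholders): encode each patch and compare the linear correlation of a coefficient pair with the correlation of their magnitudes; the latter exposes variance dependencies even when the linear correlation is near zero.

import numpy as np

def coefficient_dependence(patches, A, i, j):
    # patches: (N, D) array of flattened image patches; returns (linear corr, magnitude corr)
    U = np.linalg.solve(A, patches.T).T    # one row of basis coefficients per patch
    linear = np.corrcoef(U[:, i], U[:, j])[0, 1]
    magnitude = np.corrcoef(np.abs(U[:, i]), np.abs(U[:, j]))[0, 1]
    return linear, magnitude

# A scatter plot of U[:, i] against U[:, j], restricted to patches drawn from a single
# image region, would reproduce panels like fig. 1b.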
33 Statistical dependencies among ‘independent’ components. A linear model can only achieve limited statistical independence among the basis function coefficients and thus can only capture a limited degree of visual structure. [sent-48, score-1.055]
34 Deviations from independence among the coefficients reflect particular kinds of visual structure: over the whole ensemble the coefficients appear roughly independent (fig. 1a), but for particular images the joint distributions of coefficients show complex statistical dependencies that reflect the higher-order structure (figs. 1b, c). [sent-49, score-0.396]
36 The challenge for developing more general models of efficient coding is formulating a description of these higher-order correlations in a way that captures meaningful higher-order visual structure. [sent-55, score-0.411]
37 2 Modeling higher-order statistical structure. The basic model of standard efficient coding methods has two major limitations. [sent-56, score-0.288]
38 Second, the model can capture statistical relationships among the pixels, but does not provide any means to capture higher-order relationships that cannot simply be described at the pixel level. [sent-58, score-0.375]
39 As a first step toward overcoming these limitations, we extend the basic model by introducing a non-independent prior to model higher-order statistical relationships among the basis function coefficients. [sent-59, score-0.508]
40 Given a Gabor-wavelet-like representation of natural images learned by ICA, one salient statistical regularity is the covariation of basis function coefficients in different visual contexts. [sent-60, score-0.943]
41 Different types of image regions will exhibit different statistical regularities among the variances of the coefficients. [sent-64, score-0.756]
42 For a large ensemble of images, the goal is to find a code that describes these higher-order correlations efficiently. [sent-65, score-0.227]
43 In the standard efficient coding model, the coefficients are often assumed to follow a generalized Gaussian distribution, p(ui) = z exp(−|ui /λi|^q) (4), where z = q/(2λi Γ[1/q]). [sent-66, score-0.203]
44 The exponent q determines the distribution’s shape and weight of the tails, and can be fixed or estimated from the data for each basis function coefficient. [sent-67, score-0.328]
45 The parameter λi determines the scale of variation (usually fixed in linear models, since basis vectors in A can absorb the scaling). [sent-68, score-0.371]
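For concreteness, a sketch of the log-density of Eq. (4); q and λi are free parameters here, and any particular values are illustrative assumptions rather than values taken from the paper.

import numpy as np
from scipy.special import gammaln

def gen_gaussian_logpdf(u, lam, q):
    # log p(u_i) = log z - |u_i/lambda_i|^q, with z = q / (2 lambda_i Gamma(1/q))
    log_z = np.log(q) - np.log(2.0 * lam) - gammaln(1.0 / q)
    return log_z - np.abs(u / lam) ** q

# e.g. gen_gaussian_logpdf(u, lam=1.0, q=0.7) gives a sparse, heavy-tailed prior;
# q=2 recovers a Gaussian and q=1 a Laplacian.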
46 Because we want to capture regularities among the variance patterns of the coefficients, we do not want to model the values of u themselves. [sent-70, score-0.39]
47 Instead, we assume that the relative variances in different visual contexts can be modeled with a linear basis as follows: λi = exp([Bv]i), i.e., log λ = Bv (5). [sent-71, score-0.571]
48 This formulation is useful because it uses a basis to represent the deviation from the variance assumed by the standard model. [sent-73, score-0.409]
49 Because the distribution is sparse, only a few of the basis vectors in B are needed to describe how any particular image deviates from the default assumption of independence. [sent-77, score-0.774]
50 By maximizing the posterior p(v|u, B), the algorithm computes the best way to describe how the distribution of the ui’s for the current image patch deviates from the default assumption of independence. [sent-82, score-0.662]
51 The basis functions in B represent an efficient, sparse, distributed code for commonly observed deviations. [sent-86, score-0.514]
52 In contrast to the first layer, where basis functions in A correspond to specific visual features, higher-order basis functions in B describe the shapes of image distributions. [sent-87, score-1.314]
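A sketch of this inference step under the stated model (λi = exp([Bv]i) with the generalized Gaussian of Eq. (4)); the sparse prior on v, the step size, and the iteration count are assumptions for illustration, not the authors' settings.

import numpy as np

def map_infer_v(u, B, q=1.0, alpha=1.0, steps=50, lr=0.01):
    # Gradient ascent on log p(v|u,B) = log p(u|B,v) + log p(v) + const
    v = np.zeros(B.shape[1])
    for _ in range(steps):
        lam = np.exp(B @ v)                          # per-coefficient scales, Eq. (5)
        # d log p(u|B,v) / d(log lambda_i) = q*|u_i/lambda_i|^q - 1
        dlog_lam = q * np.abs(u / lam) ** q - 1.0
        grad = B.T @ dlog_lam - alpha * np.sign(v)   # assumed Laplacian-like sparse prior on v
        v = v + lr * grad
    return v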
53 Assuming independence between B and v, the marginal likelihood is p(x|A, B) = ∫ p(u|B, v) p(v) / |det A| dv. [sent-90, score-0.179]
54 We adapt B by maximizing the likelihood over the data ensemble, B = argmaxB ∑n log p(un |B, v̂n) + log p(B) (11). For reasons of space, we omit the (straightforward) derivations of the gradients. [sent-93, score-0.2]
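A sketch of the corresponding learning step: the gradient of ∑n log p(un |B, v̂n) with respect to B follows directly from Eqs. (4) and (5), while the prior p(B) is treated as flat here, which is an assumption made only for illustration.

import numpy as np

def grad_B(U, V_hat, B, q=1.0):
    # Gradient of sum_n log p(u_n | B, v_hat_n) with respect to B.
    # U: (N, D) coefficient vectors; V_hat: (N, K) inferred higher-order coefficients.
    G = np.zeros_like(B)
    for u, v in zip(U, V_hat):
        lam = np.exp(B @ v)
        dlog_lam = q * np.abs(u / lam) ** q - 1.0    # same per-coefficient term as in inference
        G += np.outer(dlog_lam, v)
    return G

def update_B(U, V_hat, B, lr=1e-3, q=1.0):
    # One gradient-ascent step on the objective of Eq. (11), with a flat prior on B
    return B + lr * grad_B(U, V_hat, B, q)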
55 Figure 2: A subset of the 400 image basis functions. [sent-94, score-0.665]
56 3 Results. The algorithm described above was applied to a standard set of ten 512×512 natural images used in [2]. [sent-96, score-0.303]
57 For computational simplicity, prior to the adaptation of the higher-order basis B, a 20 × 20 ICA image basis was derived using standard methods. [sent-97, score-0.993]
58 A subset of these basis functions is shown in fig. 2. [sent-100, score-0.413]
59 Because of the computational complexity of the learning procedure, the number of basis functions in B was limited to 30, although in principle a complete basis of 400 could be learned. [sent-102, score-0.794]
60 The basis B was initialized to small random values and gradient ascent was performed for 4000 iterations, with a fixed step size of 0. [sent-103, score-0.43]
61 For each batch of 5000 randomly sampled image patches, v̂ was derived using 50 steps of gradient ascent at a fixed step size of 0. [sent-105, score-0.439]
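Putting the pieces together, a sketch of the reported schedule (30 higher-order basis functions, 4000 gradient steps, batches of 5000 random 20 × 20 patches, 50 inference steps per patch); the step sizes are elided in this text, so the values below are placeholders, and map_infer_v and update_B refer to the sketches above.

import numpy as np

rng = np.random.default_rng(0)
D, K = 400, 30
B = 0.01 * rng.standard_normal((D, K))        # small random initialization of the higher-order basis

def sample_patches(images, n=5000, size=20):
    # Draw n random size x size patches from a list of grayscale images and flatten them
    out = np.empty((n, size * size))
    for k in range(n):
        img = images[rng.integers(len(images))]
        r = rng.integers(img.shape[0] - size + 1)
        c = rng.integers(img.shape[1] - size + 1)
        out[k] = img[r:r + size, c:c + size].ravel()
    return out

# for it in range(4000):
#     X = sample_patches(images)                          # images: the ten 512x512 natural images
#     U = np.linalg.solve(A, X.T).T                       # first-layer ICA coefficients (A held fixed)
#     V = np.stack([map_infer_v(u, B, steps=50) for u in U])
#     B = update_B(U, V, B, lr=...)                       # step size elided in the source text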
62 Fig. 3 shows three different representations of the basis functions in the matrix B adapted to natural images. [sent-108, score-0.66]
63 Fig. 3a shows the values of the 30 basis functions in B in their original learned order. [sent-110, score-0.471]
64 Each square represents the 400 weights Bi,j from a particular vj to all of the image basis functions ui. [sent-111, score-0.965]
65 In this representation, the weights appear sparse, but otherwise show no apparent structure, simply because basis functions in A are unordered. [sent-113, score-0.471]
66 In fig. 3b, the dots representing the same weights are arranged according to the spatial location within an image patch (as determined by fitting a 2D Gabor function) of the basis function that the weight affects. [sent-117, score-1.143]
67 Each weight is shown as a dot; white dots represent positive weights, black dots negative weights. [sent-118, score-0.184]
68 In fig. 3c, the same weights are arranged according to the orientation and spatial scale of the Gaussian envelope of the fitted Gabor. [sent-120, score-0.295]
69 Orientation ranges from 0 to π counter-clockwise from the horizontal axis, and spatial scale ranges radially from DC at the bottom center to Nyquist. [sent-121, score-0.25]
70 (Note that the learned basis functions can only be approximately fit by Gabor functions, which limits the precision of the visualizations.) [sent-122, score-0.471]
71 In these arrangements, several types of higher-order regularities emerge. [sent-123, score-0.19]
72 The predominant one is that coefficient variances are spatially correlated, which reflects the common occurrence of image patches containing a small localized object against a relatively uniform background. [sent-124, score-0.666]
73 Fig. 3b shows that the coefficient variances in the top and bottom halves of the image patch are often anti-correlated. [sent-126, score-0.602]
74 Because vi can be positive or negative, the higher-order basis functions in B represent contrast in the variance patterns. [sent-129, score-0.574]
75 Other common regularities are variance-contrasts between two orientations for all spatial positions (e.g. row 7, column 1) and between low and high spatial scales for all positions and orientations. [sent-130, score-0.331]
77 Most higher-order basis functions have simple structure in either position, orientation, or scale, but there are some whose organization is less obvious. [sent-135, score-0.484]
78 Figure 3: The learned higher-order basis functions. [sent-136, score-0.386]
79 The same weights shown in the original order (a); rearranged according to the spatial location of the corresponding image basis functions (b); rearranged according to frequency and orientation of image basis functions (c). [sent-137, score-1.954]
80 Figure 4: Image patches that yielded the largest coefficients for two basis functions in B. [sent-139, score-0.527]
81 The central block contains nine image patches corresponding to higher-order basis function coefficients with values near zero. [sent-140, score-0.822]
82 Positions of other nine-patch blocks correspond to the associated values of higher-order coefficients, here v15 and v27 (whose weights to ui ’s are shown at the axes extrema). [sent-143, score-0.215]
83 For example, the upper-left block contains image patches for which v15 was highly negative (contrast localized to the bottom half of the patch) and v27 was highly positive (power predominantly at low spatial scales). [sent-144, score-0.666]
84 This illustrates how different combinations of basis functions in B define distributions of images (in this case, spatial frequency and location). [sent-145, score-0.685]
85 Another way to get insight into the code learned by the model is to display, for a large ensemble of image patches, the patches that yield the largest values of particular vi’s (and their corresponding basis functions in B). [sent-146, score-1.11]
86 As a check to see if any of the higher-order structure learned by the algorithm was simply due to random variations in the dataset, we generated a dataset by drawing independent samples un from a generalized Gaussian to produce the pattern xn = Aun . [sent-149, score-0.345]
87 The resulting basis B was composed only of small random values, indicating essentially no deviation from the standard assumption of independence and unit variance. [sent-150, score-0.424]
88 In addition, adapting the model on a synthetic dataset generated from a hand-specified B recovers the original higher-order basis functions. [sent-151, score-0.38]
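A sketch of the first control described above (generating patterns xn = A un from independent generalized Gaussian samples); the exponent q and the particular sampling trick are assumptions for illustration, and A refers to the fixed image basis from the earlier sketch.

import numpy as np

def gen_gaussian_samples(rng, shape, q=0.7, lam=1.0):
    # Density proportional to exp(-|u/lam|^q): |u/lam|^q ~ Gamma(1/q, 1), sign uniform
    mag = rng.gamma(1.0 / q, 1.0, size=shape) ** (1.0 / q)
    sign = rng.choice([-1.0, 1.0], size=shape)
    return lam * sign * mag

rng = np.random.default_rng(1)
U_ctrl = gen_gaussian_samples(rng, (5000, A.shape[1]))    # independent coefficients u_n
X_ctrl = U_ctrl @ A.T                                     # x_n = A u_n, one synthetic patch per row
# Re-running the learning procedure on X_ctrl should leave B near its small random initialization,
# as reported in the text.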
89 To check the validity of first deriving B for a fixed A, both matrices were adapted simultaneously for small 8 × 8 patches on the same natural image data set. [sent-153, score-0.669]
90 The results for both the image basis matrix A and the higher-order basis B were qualitatively similar to those reported above. [sent-154, score-0.993]
91 4 Discussion. We have presented a model for learning higher-order statistical regularities in natural images by learning an efficient, sparse-distributed code for the basis function coefficient variances. [sent-155, score-0.946]
92 One salient type of higher-order structure learned by the model is the position of image structure within the patch. [sent-159, score-0.581]
93 It is interesting that, rather than encoding specific locations, the model learned a coarse code of position using broadly tuned spatial patterns. [sent-160, score-0.253]
94 This could offer novel insights into the function of the broad tuning of higher level visual neurons. [sent-161, score-0.236]
95 Emergence of simple-cell receptive-field properties by learning a sparse code for natural images. [sent-176, score-0.34]
96 The ’independent components’ of natural scenes are edge filters. [sent-183, score-0.193]
97 Independent component filters of natural images compared with simple cells in primary visual cortex. [sent-190, score-0.594]
98 Independent component analysis of natural image sequences yield spatiotemporal filters similar to simple cells in primary visual cortex. [sent-200, score-0.753]
99 Unsupervised classification, segmentation and de-noising of images using ICA mixture models. [sent-223, score-0.213]
100 A multi-layer sparse coding network learns contour coding from natural images. [sent-231, score-0.549]
wordName wordTfidf (topN-words)
[('image', 0.337), ('basis', 0.328), ('coef', 0.3), ('cients', 0.199), ('images', 0.178), ('coding', 0.158), ('ui', 0.157), ('regularities', 0.155), ('visual', 0.151), ('bv', 0.145), ('patch', 0.136), ('natural', 0.125), ('ica', 0.124), ('ef', 0.119), ('patches', 0.114), ('code', 0.101), ('independence', 0.096), ('rearranged', 0.095), ('spatial', 0.094), ('dots', 0.092), ('variances', 0.092), ('ensemble', 0.087), ('functions', 0.085), ('dependencies', 0.083), ('cient', 0.082), ('variance', 0.081), ('vi', 0.08), ('among', 0.078), ('argmax', 0.076), ('capture', 0.076), ('schwartz', 0.073), ('structure', 0.071), ('simoncelli', 0.069), ('scenes', 0.068), ('representations', 0.065), ('ascent', 0.064), ('higherorder', 0.063), ('sparse', 0.062), ('object', 0.06), ('statistical', 0.059), ('default', 0.058), ('weights', 0.058), ('hateren', 0.058), ('hoyer', 0.058), ('yan', 0.058), ('sensory', 0.058), ('learned', 0.058), ('cells', 0.057), ('adapted', 0.057), ('orientation', 0.057), ('location', 0.055), ('limited', 0.053), ('adapting', 0.052), ('properties', 0.052), ('deviates', 0.051), ('joint', 0.05), ('scene', 0.05), ('gabor', 0.048), ('codes', 0.048), ('computations', 0.048), ('independent', 0.047), ('lewicki', 0.046), ('un', 0.046), ('det', 0.046), ('learns', 0.046), ('primary', 0.045), ('generalized', 0.045), ('insights', 0.044), ('salient', 0.044), ('relationships', 0.043), ('positions', 0.043), ('block', 0.043), ('fundamentally', 0.043), ('arranged', 0.043), ('scale', 0.043), ('xn', 0.042), ('ciency', 0.042), ('learn', 0.042), ('vision', 0.042), ('localized', 0.041), ('offers', 0.041), ('offer', 0.041), ('row', 0.04), ('derive', 0.039), ('orientations', 0.039), ('correlations', 0.039), ('component', 0.038), ('ranges', 0.038), ('gradient', 0.038), ('van', 0.038), ('likelihood', 0.037), ('statistics', 0.037), ('bottom', 0.037), ('range', 0.036), ('check', 0.036), ('wide', 0.036), ('types', 0.035), ('segmentation', 0.035), ('response', 0.035), ('deviations', 0.034)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999994 10 nips-2002-A Model for Learning Variance Components of Natural Images
Author: Yan Karklin, Michael S. Lewicki
Abstract: We present a hierarchical Bayesian model for learning efficient codes of higher-order structure in natural images. The model, a non-linear generalization of independent component analysis, replaces the standard assumption of independence for the joint distribution of coefficients with a distribution that is adapted to the variance structure of the coefficients of an efficient image basis. This offers a novel description of higher-order image structure and provides a way to learn coarse-coded, sparse-distributed representations of abstract image properties such as object location, scale, and texture.
2 0.28157124 2 nips-2002-A Bilinear Model for Sparse Coding
Author: David B. Grimes, Rajesh P. Rao
Abstract: Recent algorithms for sparse coding and independent component analysis (ICA) have demonstrated how localized features can be learned from natural images. However, these approaches do not take image transformations into account. As a result, they produce image codes that are redundant because the same feature is learned at multiple locations. We describe an algorithm for sparse coding based on a bilinear generative model of images. By explicitly modeling the interaction between image features and their transformations, the bilinear approach helps reduce redundancy in the image code and provides a basis for transformationinvariant vision. We present results demonstrating bilinear sparse coding of natural images. We also explore an extension of the model that can capture spatial relationships between the independent features of an object, thereby providing a new framework for parts-based object recognition.
3 0.24878679 126 nips-2002-Learning Sparse Multiscale Image Representations
Author: Phil Sallee, Bruno A. Olshausen
Abstract: We describe a method for learning sparse multiscale image representations using a sparse prior distribution over the basis function coefficients. The prior consists of a mixture of a Gaussian and a Dirac delta function, and thus encourages coefficients to have exact zero values. Coefficients for an image are computed by sampling from the resulting posterior distribution with a Gibbs sampler. The learned basis is similar to the Steerable Pyramid basis, and yields slightly higher SNR for the same number of active coefficients. Denoising using the learned image model is demonstrated for some standard test images, with results that compare favorably with other denoising methods. 1
4 0.23308904 14 nips-2002-A Probabilistic Approach to Single Channel Blind Signal Separation
Author: Gil-jin Jang, Te-Won Lee
Abstract: We present a new technique for achieving source separation when given only a single channel recording. The main idea is based on exploiting the inherent time structure of sound sources by learning a priori sets of basis filters in time domain that encode the sources in a statistically efficient manner. We derive a learning algorithm using a maximum likelihood approach given the observed single channel data and sets of basis filters. For each time point we infer the source signals and their contribution factors. This inference is possible due to the prior knowledge of the basis filters and the associated coefficient densities. A flexible model for density estimation allows accurate modeling of the observation and our experimental results exhibit a high level of separation performance for mixtures of two music signals as well as the separation of two voice signals.
5 0.21495171 39 nips-2002-Bayesian Image Super-Resolution
Author: Michael E. Tipping, Christopher M. Bishop
Abstract: The extraction of a single high-quality image from a set of lowresolution images is an important problem which arises in fields such as remote sensing, surveillance, medical imaging and the extraction of still images from video. Typical approaches are based on the use of cross-correlation to register the images followed by the inversion of the transformation from the unknown high resolution image to the observed low resolution images, using regularization to resolve the ill-posed nature of the inversion process. In this paper we develop a Bayesian treatment of the super-resolution problem in which the likelihood function for the image registration parameters is based on a marginalization over the unknown high-resolution image. This approach allows us to estimate the unknown point spread function, and is rendered tractable through the introduction of a Gaussian process prior over images. Results indicate a significant improvement over techniques based on MAP (maximum a-posteriori) point optimization of the high resolution image and associated registration parameters. 1
6 0.20810527 57 nips-2002-Concurrent Object Recognition and Segmentation by Graph Partitioning
7 0.18742371 116 nips-2002-Interpreting Neural Response Variability as Monte Carlo Sampling of the Posterior
8 0.18232279 132 nips-2002-Learning to Detect Natural Image Boundaries Using Brightness and Texture
9 0.1796267 173 nips-2002-Recovering Intrinsic Images from a Single Image
11 0.17039926 127 nips-2002-Learning Sparse Topographic Representations with Products of Student-t Distributions
12 0.16368429 202 nips-2002-Unsupervised Color Constancy
13 0.16292791 193 nips-2002-Temporal Coherence, Natural Image Sequences, and the Visual Cortex
14 0.15431678 74 nips-2002-Dynamic Structure Super-Resolution
15 0.1470934 133 nips-2002-Learning to Perceive Transparency from the Statistics of Natural Scenes
16 0.13838667 182 nips-2002-Shape Recipes: Scene Representations that Refer to the Image
17 0.13152447 206 nips-2002-Visual Development Aids the Acquisition of Motion Velocity Sensitivities
18 0.1254918 19 nips-2002-Adapting Codes and Embeddings for Polychotomies
19 0.12037381 122 nips-2002-Learning About Multiple Objects in Images: Factorial Learning without Factorial Search
20 0.11723334 38 nips-2002-Bayesian Estimation of Time-Frequency Coefficients for Audio Signal Enhancement
topicId topicWeight
[(0, -0.323), (1, 0.08), (2, -0.0), (3, 0.437), (4, 0.021), (5, -0.111), (6, 0.139), (7, -0.005), (8, 0.033), (9, -0.062), (10, 0.028), (11, -0.096), (12, 0.062), (13, 0.039), (14, 0.089), (15, 0.096), (16, 0.092), (17, -0.032), (18, -0.088), (19, 0.084), (20, -0.067), (21, -0.103), (22, -0.088), (23, 0.04), (24, 0.069), (25, -0.099), (26, 0.001), (27, 0.071), (28, 0.208), (29, 0.03), (30, -0.025), (31, -0.014), (32, -0.065), (33, 0.092), (34, -0.072), (35, -0.055), (36, 0.001), (37, -0.04), (38, -0.037), (39, -0.079), (40, -0.041), (41, -0.022), (42, -0.095), (43, -0.028), (44, -0.036), (45, -0.078), (46, -0.01), (47, 0.029), (48, -0.076), (49, -0.05)]
simIndex simValue paperId paperTitle
same-paper 1 0.99114162 10 nips-2002-A Model for Learning Variance Components of Natural Images
Author: Yan Karklin, Michael S. Lewicki
Abstract: We present a hierarchical Bayesian model for learning efficient codes of higher-order structure in natural images. The model, a non-linear generalization of independent component analysis, replaces the standard assumption of independence for the joint distribution of coefficients with a distribution that is adapted to the variance structure of the coefficients of an efficient image basis. This offers a novel description of higher-order image structure and provides a way to learn coarse-coded, sparse-distributed representations of abstract image properties such as object location, scale, and texture.
2 0.92974621 2 nips-2002-A Bilinear Model for Sparse Coding
Author: David B. Grimes, Rajesh P. Rao
Abstract: Recent algorithms for sparse coding and independent component analysis (ICA) have demonstrated how localized features can be learned from natural images. However, these approaches do not take image transformations into account. As a result, they produce image codes that are redundant because the same feature is learned at multiple locations. We describe an algorithm for sparse coding based on a bilinear generative model of images. By explicitly modeling the interaction between image features and their transformations, the bilinear approach helps reduce redundancy in the image code and provides a basis for transformationinvariant vision. We present results demonstrating bilinear sparse coding of natural images. We also explore an extension of the model that can capture spatial relationships between the independent features of an object, thereby providing a new framework for parts-based object recognition.
3 0.85489774 126 nips-2002-Learning Sparse Multiscale Image Representations
Author: Phil Sallee, Bruno A. Olshausen
Abstract: We describe a method for learning sparse multiscale image representations using a sparse prior distribution over the basis function coefficients. The prior consists of a mixture of a Gaussian and a Dirac delta function, and thus encourages coefficients to have exact zero values. Coefficients for an image are computed by sampling from the resulting posterior distribution with a Gibbs sampler. The learned basis is similar to the Steerable Pyramid basis, and yields slightly higher SNR for the same number of active coefficients. Denoising using the learned image model is demonstrated for some standard test images, with results that compare favorably with other denoising methods. 1
4 0.64482015 193 nips-2002-Temporal Coherence, Natural Image Sequences, and the Visual Cortex
Author: Jarmo Hurri, Aapo Hyvärinen
Abstract: We show that two important properties of the primary visual cortex emerge when the principle of temporal coherence is applied to natural image sequences. The properties are simple-cell-like receptive fields and complex-cell-like pooling of simple cell outputs, which emerge when we apply two different approaches to temporal coherence. In the first approach we extract receptive fields whose outputs are as temporally coherent as possible. This approach yields simple-cell-like receptive fields (oriented, localized, multiscale). Thus, temporal coherence is an alternative to sparse coding in modeling the emergence of simple cell receptive fields. The second approach is based on a two-layer statistical generative model of natural image sequences. In addition to modeling the temporal coherence of individual simple cells, this model includes inter-cell temporal dependencies. Estimation of this model from natural data yields both simple-cell-like receptive fields, and complex-cell-like pooling of simple cell outputs. In this completely unsupervised learning, both layers of the generative model are estimated simultaneously from scratch. This is a significant improvement on earlier statistical models of early vision, where only one layer has been learned, and others have been fixed a priori.
5 0.64377505 127 nips-2002-Learning Sparse Topographic Representations with Products of Student-t Distributions
Author: Max Welling, Simon Osindero, Geoffrey E. Hinton
Abstract: We propose a model for natural images in which the probability of an image is proportional to the product of the probabilities of some filter outputs. We encourage the system to find sparse features by using a Studentt distribution to model each filter output. If the t-distribution is used to model the combined outputs of sets of neurally adjacent filters, the system learns a topographic map in which the orientation, spatial frequency and location of the filters change smoothly across the map. Even though maximum likelihood learning is intractable in our model, the product form allows a relatively efficient learning procedure that works well even for highly overcomplete sets of filters. Once the model has been learned it can be used as a prior to derive the “iterated Wiener filter” for the purpose of denoising images.
6 0.59839636 14 nips-2002-A Probabilistic Approach to Single Channel Blind Signal Separation
7 0.5978778 133 nips-2002-Learning to Perceive Transparency from the Statistics of Natural Scenes
8 0.5831883 132 nips-2002-Learning to Detect Natural Image Boundaries Using Brightness and Texture
9 0.55500346 57 nips-2002-Concurrent Object Recognition and Segmentation by Graph Partitioning
10 0.53589416 173 nips-2002-Recovering Intrinsic Images from a Single Image
11 0.51770914 116 nips-2002-Interpreting Neural Response Variability as Monte Carlo Sampling of the Posterior
12 0.51339012 202 nips-2002-Unsupervised Color Constancy
14 0.50163281 39 nips-2002-Bayesian Image Super-Resolution
15 0.50101823 182 nips-2002-Shape Recipes: Scene Representations that Refer to the Image
16 0.46516022 150 nips-2002-Multiple Cause Vector Quantization
17 0.46453807 38 nips-2002-Bayesian Estimation of Time-Frequency Coefficients for Audio Signal Enhancement
18 0.45568797 87 nips-2002-Fast Transformation-Invariant Factor Analysis
19 0.44398096 110 nips-2002-Incremental Gaussian Processes
20 0.4248783 122 nips-2002-Learning About Multiple Objects in Images: Factorial Learning without Factorial Search
topicId topicWeight
[(3, 0.014), (11, 0.017), (23, 0.041), (26, 0.05), (42, 0.087), (54, 0.161), (55, 0.094), (57, 0.014), (64, 0.046), (67, 0.029), (68, 0.03), (74, 0.135), (92, 0.048), (98, 0.139)]
simIndex simValue paperId paperTitle
same-paper 1 0.97386307 10 nips-2002-A Model for Learning Variance Components of Natural Images
Author: Yan Karklin, Michael S. Lewicki
Abstract: We present a hierarchical Bayesian model for learning efficient codes of higher-order structure in natural images. The model, a non-linear generalization of independent component analysis, replaces the standard assumption of independence for the joint distribution of coefficients with a distribution that is adapted to the variance structure of the coefficients of an efficient image basis. This offers a novel description of higher-order image structure and provides a way to learn coarse-coded, sparse-distributed representations of abstract image properties such as object location, scale, and texture.
2 0.96370447 2 nips-2002-A Bilinear Model for Sparse Coding
Author: David B. Grimes, Rajesh P. Rao
Abstract: Recent algorithms for sparse coding and independent component analysis (ICA) have demonstrated how localized features can be learned from natural images. However, these approaches do not take image transformations into account. As a result, they produce image codes that are redundant because the same feature is learned at multiple locations. We describe an algorithm for sparse coding based on a bilinear generative model of images. By explicitly modeling the interaction between image features and their transformations, the bilinear approach helps reduce redundancy in the image code and provides a basis for transformationinvariant vision. We present results demonstrating bilinear sparse coding of natural images. We also explore an extension of the model that can capture spatial relationships between the independent features of an object, thereby providing a new framework for parts-based object recognition.
3 0.94971609 193 nips-2002-Temporal Coherence, Natural Image Sequences, and the Visual Cortex
Author: Jarmo Hurri, Aapo Hyvärinen
Abstract: We show that two important properties of the primary visual cortex emerge when the principle of temporal coherence is applied to natural image sequences. The properties are simple-cell-like receptive fields and complex-cell-like pooling of simple cell outputs, which emerge when we apply two different approaches to temporal coherence. In the first approach we extract receptive fields whose outputs are as temporally coherent as possible. This approach yields simple-cell-like receptive fields (oriented, localized, multiscale). Thus, temporal coherence is an alternative to sparse coding in modeling the emergence of simple cell receptive fields. The second approach is based on a two-layer statistical generative model of natural image sequences. In addition to modeling the temporal coherence of individual simple cells, this model includes inter-cell temporal dependencies. Estimation of this model from natural data yields both simple-cell-like receptive fields, and complex-cell-like pooling of simple cell outputs. In this completely unsupervised learning, both layers of the generative model are estimated simultaneously from scratch. This is a significant improvement on earlier statistical models of early vision, where only one layer has been learned, and others have been fixed a priori.
4 0.93898106 52 nips-2002-Cluster Kernels for Semi-Supervised Learning
Author: Olivier Chapelle, Jason Weston, Bernhard Schölkopf
Abstract: We propose a framework to incorporate unlabeled data in kernel classifier, based on the idea that two points in the same cluster are more likely to have the same label. This is achieved by modifying the eigenspectrum of the kernel matrix. Experimental results assess the validity of this approach. 1
5 0.9360044 28 nips-2002-An Information Theoretic Approach to the Functional Classification of Neurons
Author: Elad Schneidman, William Bialek, Michael Ii
Abstract: A population of neurons typically exhibits a broad diversity of responses to sensory inputs. The intuitive notion of functional classification is that cells can be clustered so that most of the diversity is captured by the identity of the clusters rather than by individuals within clusters. We show how this intuition can be made precise using information theory, without any need to introduce a metric on the space of stimuli or responses. Applied to the retinal ganglion cells of the salamander, this approach recovers classical results, but also provides clear evidence for subclasses beyond those identified previously. Further, we find that each of the ganglion cells is functionally unique, and that even within the same subclass only a few spikes are needed to reliably distinguish between cells. 1
6 0.93516123 68 nips-2002-Discriminative Densities from Maximum Contrast Estimation
7 0.93339944 127 nips-2002-Learning Sparse Topographic Representations with Products of Student-t Distributions
8 0.93267512 3 nips-2002-A Convergent Form of Approximate Policy Iteration
9 0.93079889 24 nips-2002-Adaptive Scaling for Feature Selection in SVMs
10 0.92904633 53 nips-2002-Clustering with the Fisher Score
11 0.92883766 27 nips-2002-An Impossibility Theorem for Clustering
12 0.92844021 88 nips-2002-Feature Selection and Classification on Matrix Data: From Large Margins to Small Covering Numbers
13 0.92819798 141 nips-2002-Maximally Informative Dimensions: Analyzing Neural Responses to Natural Signals
14 0.92519087 188 nips-2002-Stability-Based Model Selection
15 0.9225468 132 nips-2002-Learning to Detect Natural Image Boundaries Using Brightness and Texture
16 0.92208314 93 nips-2002-Forward-Decoding Kernel-Based Phone Recognition
17 0.92165625 48 nips-2002-Categorization Under Complexity: A Unified MDL Account of Human Learning of Regular and Irregular Categories
18 0.92013764 89 nips-2002-Feature Selection by Maximum Marginal Diversity
19 0.91987395 203 nips-2002-Using Tarjan's Red Rule for Fast Dependency Tree Construction
20 0.91969085 37 nips-2002-Automatic Derivation of Statistical Algorithms: The EM Family and Beyond