nips nips2002 nips2002-2 knowledge-graph by maker-knowledge-mining

2 nips-2002-A Bilinear Model for Sparse Coding


Source: pdf

Author: David B. Grimes, Rajesh P. Rao

Abstract: Recent algorithms for sparse coding and independent component analysis (ICA) have demonstrated how localized features can be learned from natural images. However, these approaches do not take image transformations into account. As a result, they produce image codes that are redundant because the same feature is learned at multiple locations. We describe an algorithm for sparse coding based on a bilinear generative model of images. By explicitly modeling the interaction between image features and their transformations, the bilinear approach helps reduce redundancy in the image code and provides a basis for transformation-invariant vision. We present results demonstrating bilinear sparse coding of natural images. We also explore an extension of the model that can capture spatial relationships between the independent features of an object, thereby providing a new framework for parts-based object recognition.

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Abstract Recent algorithms for sparse coding and independent component analysis (ICA) have demonstrated how localized features can be learned from natural images. [sent-9, score-0.671]

2 However, these approaches do not take image transformations into account. [sent-10, score-0.32]

3 As a result, they produce image codes that are redundant because the same feature is learned at multiple locations. [sent-11, score-0.3]

4 We describe an algorithm for sparse coding based on a bilinear generative model of images. [sent-12, score-1.206]

5 By explicitly modeling the interaction between image features and their transformations, the bilinear approach helps reduce redundancy in the image code and provides a basis for transformation-invariant vision. [sent-13, score-1.441]

6 We present results demonstrating bilinear sparse coding of natural images. [sent-14, score-1.185]

7 We also explore an extension of the model that can capture spatial relationships between the independent features of an object, thereby providing a new framework for parts-based object recognition. [sent-15, score-0.29]

8 1 Introduction Algorithms for redundancy reduction and efficient coding have been the subject of considerable attention in recent years [6, 3, 4, 7, 9, 5, 11]. [sent-16, score-0.247]

9 Although the basic ideas can be traced to the early work of Attneave [1] and Barlow [2], recent techniques such as independent component analysis (ICA) and sparse coding have helped formalize these ideas and have demonstrated the feasibility of efficient coding through redundancy reduction. [sent-17, score-0.658]

10 These techniques produce an efficient code by using appropriate constraints to minimize the dependencies between elements of the code. [sent-18, score-0.127]

11 One of the most successful applications of ICA and sparse coding has been in the area of image coding. [sent-19, score-0.512]

12 Olshausen and Field showed that sparse coding of natural images produces localized, oriented basis filters that resemble the receptive fields of simple cells in primary visual cortex [6, 7]. [sent-20, score-0.759]

13 However, these approaches do not take image transformations into account. [sent-22, score-0.32]

14 As a result, the same oriented feature is often learned at different locations, yielding a redundant code. [sent-23, score-0.203]

15 Moreover, the presence of the same feature at multiple locations prevents more complex features from being learned and leads to a combinatorial explosion when one attempts to scale the approach to large image patches or hierarchical networks. [sent-24, score-0.431]

16 In this paper, we propose an approach to sparse coding that explicitly models the interaction between image features and their transformations. [sent-25, score-0.624]

17 A bilinear generative model is used to learn both the independent features in an image as well as their transformations. [sent-26, score-1.177]

18 Our approach extends Tenenbaum and Freeman’s work on bilinear models for learning content and style [12] by casting the problem within a probabilistic sparse coding framework. [sent-27, score-1.28]

19 Thus, whereas prior work on bilinear models used global decomposition methods such as SVD, the approach presented here emphasizes the extraction of local features by removing higher-order redundancies through sparseness constraints. [sent-28, score-1.059]

20 We show that for natural images, this approach produces localized, oriented filters that can be translated by different amounts to account for image features at arbitrary locations. [sent-29, score-0.506]

21 Our results demonstrate how an image can be factored into a set of basic local features and their transformations, providing a basis for transformation-invariant vision. [sent-30, score-0.397]

22 We conclude by discussing how the approach can be extended to allow parts-based object recognition, wherein an object is modeled as a collection of local features (or “parts”) and their relative transformations. [sent-31, score-0.448]

23 2 Bilinear Generative Models We begin by considering the standard linear generative model used in algorithms for ICA and sparse coding [3, 7, 9]: z = Σ_i w_i x_i (1), where z is the input vector (e. [sent-32, score-0.436]

24 an image), w_i is a basis vector of the same dimension, and x_i is its scalar coefficient. [sent-34, score-0.126]

25 Given the linear generative model above, the goal of ICA is to learn the basis vectors w_i such that the coefficients x_i are as independent as possible, while the goal in sparse coding is to make the distribution of x_i highly kurtotic given Equation 1. [sent-35, score-0.661]
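
As a rough illustration of this setup, the sketch below implements the linear generative model of Equation 1 together with a sparseness-penalized reconstruction cost in the spirit of Olshausen and Field [7]. The notation (z, W, x), the patch size, the penalty weight, and the log(1 + x²) sparseness function are illustrative assumptions, not the paper's exact choices.

```python
import numpy as np

# Illustrative sketch of the standard linear generative model (Equation 1),
# z = sum_i w_i * x_i, with a sparseness-penalized reconstruction cost.
# Shapes and the penalty weight are assumptions for illustration only.

k, m = 144, 144                          # e.g. 12x12 pixel patches, complete basis
rng = np.random.default_rng(0)
W = rng.standard_normal((k, m)) * 0.1    # columns are the basis vectors w_i
x = rng.standard_normal(m)               # scalar coefficients x_i

def reconstruct(W, x):
    """Linear generative model: z_hat = W @ x = sum_i w_i x_i."""
    return W @ x

def sparse_coding_cost(z, W, x, lam=0.1):
    """Squared reconstruction error plus a sparseness penalty on x.
    log(1 + x^2) is one common sparseness function (an assumed choice here)."""
    residual = z - reconstruct(W, x)
    return 0.5 * np.sum(residual ** 2) + lam * np.sum(np.log1p(x ** 2))

z = reconstruct(W, x) + 0.01 * rng.standard_normal(k)   # noisy synthetic "patch"
print(sparse_coding_cost(z, W, x))
```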

26 The linear generative model in Equation 1 can be extended to the bilinear case by using two independent sets of coefficients x_i and y_j (or equivalently, two vectors x and y) [12]: z = Σ_i Σ_j w_ij x_i y_j (2). [sent-36, score-0.939]

27 The coefficients x_i and y_j jointly modulate a set of basis vectors w_ij to produce an input vector z. [sent-37, score-0.175]

28 For the present study, the coefficient x_i can be regarded as encoding the presence of object feature i in the image while the values y_j determine the transformation present in the image. [sent-38, score-0.428]

29 In the terminology of Tenenbaum and Freeman [12], x describes the “content” of the image while y encodes its “style.” [sent-39, score-0.197]

30 Equation 2 can also be expressed as a linear equation in x for a fixed y: z = Σ_i (Σ_j w_ij y_j) x_i (3). Likewise, for a fixed x, one obtains a linear equation in y. [sent-40, score-0.084]

31 The power of bilinear models stems from the rich non-linear interactions that can be represented by varying both x and y simultaneously. [sent-42, score-0.851]

32 Our goal is to learn from image data an appropriate set of basis vectors w_ij that effectively describe the interactions between the feature vector x and the transformation vector y. [sent-44, score-0.513]
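
A minimal sketch of the bilinear model follows, assuming the reconstructed notation above: a basis tensor W with one basis vector w_ij per (feature i, transformation j) pair. It shows Equation 2 and the collapse to an ordinary linear model with transformed basis vectors when y is fixed (Equation 3); the tensor shapes are illustrative.

```python
import numpy as np

# Illustrative sketch of the bilinear generative model (Equation 2),
#   z = sum_i sum_j w_ij * x_i * y_j,
# and of Equation 3: fixing y yields a linear model with transformed
# basis vectors sum_j w_ij * y_j. Tensor shapes are assumptions.

k, m, n = 144, 144, 9                     # pixels, features, transformations
rng = np.random.default_rng(1)
W = rng.standard_normal((k, m, n)) * 0.1  # W[:, i, j] is the basis vector w_ij
x = rng.standard_normal(m)                # feature ("content") coefficients
y = rng.standard_normal(n)                # transformation ("style") coefficients

def bilinear_reconstruct(W, x, y):
    """Equation 2: z_hat[p] = sum_ij W[p, i, j] * x[i] * y[j]."""
    return np.einsum('pij,i,j->p', W, x, y)

def transformed_basis(W, y):
    """Equation 3: effective basis for a fixed y; column i is sum_j w_ij * y[j]."""
    return np.einsum('pij,j->pi', W, y)

z1 = bilinear_reconstruct(W, x, y)
z2 = transformed_basis(W, y) @ x
print(np.allclose(z1, z2))                # the two formulations agree
```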

33 A standard approach to minimizing such a function is to use gradient descent and alternate between minimization with respect to x and minimization with respect to y. [sent-48, score-0.086]
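
The alternating minimization described here might look like the following sketch, which takes gradient steps on x with y fixed and then on y with x fixed, for a fixed basis W. The step size, iteration count, and initialization are illustrative, and the basis update and sparseness terms are omitted.

```python
import numpy as np

# Sketch of alternating gradient descent on the squared error
# ||z - sum_ij w_ij x_i y_j||^2 for a fixed basis tensor W.
# Hyperparameters and initialization are illustrative only.

def infer_x_y(z, W, n_iters=300, step=0.02):
    k, m, n = W.shape
    x = np.zeros(m)
    y = np.zeros(n)
    y[0] = 1.0                              # assume index 0 acts as a "no transformation" code
    for _ in range(n_iters):
        Wy = np.einsum('pij,j->pi', W, y)   # with y fixed, linear in x (Equation 3)
        r = z - Wy @ x                      # residual
        x += step * (Wy.T @ r)              # gradient step on x
        Wx = np.einsum('pij,i->pj', W, x)   # with x fixed, linear in y
        r = z - Wx @ y
        y += step * (Wx.T @ r)              # gradient step on y
    return x, y

rng = np.random.default_rng(2)
W = rng.standard_normal((36, 12, 5)) * 0.1
x_true, y_true = rng.standard_normal(12), rng.standard_normal(5)
z = np.einsum('pij,i,j->p', W, x_true, y_true)
x_hat, y_hat = infer_x_y(z, W)
print(np.linalg.norm(z - np.einsum('pij,i,j->p', W, x_hat, y_hat)))  # residual norm
```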

34 The function has many local minima and results from our simulations indicate that convergence is difficult in many cases. [sent-50, score-0.051]

35 There are many different ways to represent an image, making it difficult for the method to converge to a basis set that can generalize effectively. [sent-51, score-0.104]

36 Rather than using gradient descent, their method estimates the parameters directly by computing the singular value decomposition (SVD) of a matrix containing input data corresponding to each content class in every style . [sent-54, score-0.141]

37 Their approach can be regarded as an extension of methods based on principal component analysis (PCA) applied to the bilinear case. [sent-55, score-0.835]
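
For contrast, a hedged sketch of the SVD-based estimation referred to here, in the spirit of Tenenbaum and Freeman's asymmetric bilinear fit: observations for every (style, content) pair are stacked into one matrix whose truncated SVD yields style-specific bases and content vectors in a single step. The data layout and the truncation rank J are assumptions, not necessarily the exact procedure of [12].

```python
import numpy as np

# Hedged sketch of SVD-based bilinear parameter estimation (asymmetric fit):
# a single SVD of the stacked data matrix gives style matrices and content
# vectors directly, without gradient descent. Layout and rank are assumptions.

def asymmetric_bilinear_fit(Y, n_styles, J):
    """Y has shape (n_styles * k, n_contents): block row s holds the mean
    observation vectors for style s, one column per content class."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    A = U[:, :J] * s[:J]                      # stacked style matrices [A_1; ...; A_S]
    B = Vt[:J, :]                             # one J-dimensional content vector per class
    return np.split(A, n_styles, axis=0), B   # block s of Y is approximately A_s @ B

# Toy usage: 3 styles, 4 content classes, 25-pixel observations, rank 2.
rng = np.random.default_rng(3)
k, S, C, J = 25, 3, 4, 2
Y = rng.standard_normal((S * k, C))
styles, B = asymmetric_bilinear_fit(Y, S, J)
print(np.linalg.norm(Y[:k, :] - styles[0] @ B))   # rank-J reconstruction error
```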

38 The SVD approach avoids the difficulties of convergence that plague the gradient descent method and is much faster in practice. [sent-56, score-0.109]

39 Unfortunately, the learned features tend to be global and non-localized, similar to those obtained from PCA-based methods relying on second-order statistics. [sent-57, score-0.173]

40 As a result, the method is unsuitable for the problem of learning local features of objects and their transformations. [sent-58, score-0.186]

41 The underconstrained nature of the problem can be remedied by imposing constraints on x and y. [sent-59, score-0.029]

42 3 Bilinear Sparse Coding We assume sparse priors for x and y. [sent-65, score-0.026]

43 Within a probabilistic framework, the squared error function summed over all images can be interpreted as representing the negative log likelihood of the data given the parameters (see, for example, [7]). [sent-68, score-0.117]

44 The priors on x and y can be used to marginalize this likelihood to obtain the new likelihood function. [sent-69, score-0.026]
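
Putting the pieces together, the resulting MAP-style objective could be assembled as below: the squared reconstruction error (negative log likelihood) plus sparseness penalties on x and y contributed by the priors. The log(1 + u²) sparseness function and the weights alpha and beta are assumed, illustrative choices rather than the paper's exact ones.

```python
import numpy as np

# Sketch of the objective: squared reconstruction error plus sparseness
# penalties on x and y. The sparseness function and weights are assumptions.

def sparseness(u):
    return np.sum(np.log1p(u ** 2))

def neg_log_posterior(z, W, x, y, alpha=0.1, beta=0.1):
    z_hat = np.einsum('pij,i,j->p', W, x, y)      # bilinear reconstruction (Equation 2)
    err = 0.5 * np.sum((z - z_hat) ** 2)
    return err + alpha * sparseness(x) + beta * sparseness(y)

rng = np.random.default_rng(4)
W = rng.standard_normal((36, 12, 5)) * 0.1
x, y = rng.standard_normal(12), rng.standard_normal(5)
z = np.einsum('pij,i,j->p', W, x, y)
print(neg_log_posterior(z, W, x, y))
```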

45 1 Training Paradigm We tested the algorithms for bilinear sparse coding on natural image data. [sent-77, score-1.357]

46 The natural images we used are distributed by Olshausen and Field [7], along with the code for their algorithm. [sent-78, score-0.22]

47 The training set consisted of image patches randomly extracted from ten source images. [sent-79, score-0.114]

48 The images are pre-whitened to equalize large variances in frequency, and thus speed convergence. [sent-80, score-0.114]

49 We choose to use a complete basis, and we let the number of transformation coefficients be at least as large as the number of transformations (including the no-transformation case). [sent-81, score-0.252]

50 In order to assist convergence, all learning occurs in batch mode, where the batch consisted of image patches. [sent-83, score-0.322]

51 A fixed step size was used for gradient descent using Equation 11. [sent-84, score-0.086]

52 The transformations were chosen to be 2D translations over a small range of pixels along both axes. [sent-85, score-0.215]

53 The style/content separation was enforced by learning a single vector x to describe an image patch regardless of its translation, and likewise a single vector y to describe a particular style given any image patch content. [sent-86, score-0.748]
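
A sketch of the data preparation this training paradigm implies is given below: random patches drawn from (already whitened) source images, and a transformation set consisting of a small grid of 2D translations that includes the identity. The patch size, batch size, image size, and translation range are assumptions, since the original numerical values are not preserved in this summary.

```python
import numpy as np

# Sketch of training-data preparation: random patches from whitened source
# images, plus a grid of 2D translations (identity included). All sizes and
# the translation range are illustrative assumptions.

def extract_patches(images, patch_size=12, batch_size=100, rng=None):
    rng = rng or np.random.default_rng()
    patches = []
    for _ in range(batch_size):
        img = images[rng.integers(len(images))]
        r = rng.integers(img.shape[0] - patch_size)
        c = rng.integers(img.shape[1] - patch_size)
        patches.append(img[r:r + patch_size, c:c + patch_size].ravel())
    return np.stack(patches)              # shape (batch_size, patch_size**2)

def translation_set(max_shift=2):
    """All integer 2D translations in [-max_shift, max_shift], identity included."""
    return [(dr, dc) for dr in range(-max_shift, max_shift + 1)
                     for dc in range(-max_shift, max_shift + 1)]

rng = np.random.default_rng(5)
images = [rng.standard_normal((256, 256)) for _ in range(10)]  # stand-ins for ten whitened source images
batch = extract_patches(images, rng=rng)
print(batch.shape, len(translation_set()))
```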

54 2 Bilinear Sparse Coding of Natural Images Figure 1 shows the results of training on natural image data. [sent-92, score-0.225]

55 (a) A comparison of learned features between a standard linear model and a bilinear model, both trained with the same sparseness priors. [sent-94, score-1.036]

56 The two rows for the bilinear case depict the translated object features w (see Equation 3) for two different pixel translations. [sent-95, score-1.169]

57 (b) The representation of an example natural image patch, and of the same patch translated to the left. [sent-96, score-0.445]

58 Note that the bar plot representing the feature vector x is indeed sparse, having only three significant coefficients. [sent-97, score-0.067]

59 The code for the style vectors for both the canonical patch and the translated one is likewise sparse. [sent-98, score-0.323]

60 The basis images are shown for those dimensions which have non-zero coefficients for x or y. [sent-99, score-0.198]

61 Although both show simple, localized, and oriented features, the bilinear method is able to model the same features under different transformations. [sent-103, score-0.96]

62 In this case, a range of horizontal translations was used in the training of the bilinear model. [sent-104, score-0.832]

63 Figure 1 (b) provides an example of how the bilinear sparse coding model encodes a natural image patch and the same patch after it has been translated. [sent-105, score-1.638]

64 Figure 2 shows how the model can account for a given localized feature at different locations by varying the y vector. [sent-108, score-0.202]

65 As shown in the last column of the figure, the translated local feature is generated by linearly combining a sparse set of basis vectors w_ij. [sent-109, score-0.49]

66 3 The bilinear generative model in Equation 2 uses the same set of transformation values for all the features . [sent-111, score-1.011]

67 Such a model is appropriate for global transformations [sent-112, score-0.176]

68 Figure 2 (panel labels): selected transformations y(−1,+2) and y(0,3) for Feature 1 (x57), and y(−2,0) and y(+1,0) for Feature 2 (x32), applied to the basis vectors w_ij, j = 1, ..., 8. [sent-114, score-0.366]

69 Figure 2: Translating a learned feature to multiple locations. [sent-117, score-0.101]

70 The two rows of eight images represent the individual basis vectors w_ij for two values of i. [sent-118, score-0.266]

71 The y values for two selected transformations for each feature are shown as bar plots. [sent-119, score-0.17]

72 The notation y(·,·) denotes a translation by the indicated number of pixels in the Cartesian plane. [sent-120, score-0.049]

73 The last column shows the resulting basis vectors after translation. [sent-121, score-0.153]

74 that apply to an entire image region, such as a shift of a few pixels for an image patch or a global illumination change. [sent-124, score-0.527]

75 Consider the problem of representing an object in terms of its constituent parts. [sent-125, score-0.156]

76 In this case, we would like to be able to transform each part independently of other parts in order to account for the location, orientation, and size of each part in the object image. [sent-126, score-0.19]

77 The standard bilinear model can be extended to address this need as follows: z = Σ_i Σ_j w_ij x_i y_j^i (12). Note that each object feature i now has its own set of transformation values y_j^i. [sent-127, score-1.026]
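
The extension in Equation 12 can be sketched by giving each feature its own row of transformation coefficients, as below; when every row is identical, the model reduces to the standard bilinear form of Equation 2. Shapes and notation are illustrative assumptions.

```python
import numpy as np

# Sketch of the extended model (Equation 12): each feature i carries its own
# transformation code y^i, so z = sum_i sum_j w_ij * x_i * y^i_j.
# Y is an (m x n) matrix whose row i is y^i; shapes are assumptions.

def parts_based_reconstruct(W, x, Y):
    """W: (k, m, n) basis tensor, x: (m,) feature coefficients,
    Y: (m, n) per-feature transformation coefficients."""
    return np.einsum('pij,i,ij->p', W, x, Y)

rng = np.random.default_rng(7)
k, m, n = 144, 8, 9
W = rng.standard_normal((k, m, n)) * 0.1
x = rng.standard_normal(m)

Y_shared = np.tile(rng.standard_normal(n), (m, 1))  # every feature shares one y
Y_indep = rng.standard_normal((m, n))               # independent y^i per feature

# With identical rows, Equation 12 reduces to the standard bilinear model (Equation 2).
print(np.allclose(parts_based_reconstruct(W, x, Y_shared),
                  np.einsum('pij,i,j->p', W, x, Y_shared[0])))
z_parts = parts_based_reconstruct(W, x, Y_indep)    # parts transformed independently
```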

78 We have conducted preliminary experiments to test the feasibility of Equation 12 using a set of object features learned for the standard bilinear model. [sent-130, score-1.096]

79 These results suggest that allowing independent transformations for the different features provides a rich substrate for modeling images and objects in terms of a set of local features (or parts) and their individual transformations. [sent-133, score-0.543]

80 5 Summary and Conclusion A fundamental problem in vision is to simultaneously recognize objects and their transformations [8, 10]. [sent-134, score-0.213]

81 Bilinear generative models provide a tractable way of addressing this problem by factoring an image into object features and transformations using a bilinear equation. [sent-135, score-1.431]

82 Previous approaches used unconstrained bilinear models and produced global basis vectors for image representation [12]. [sent-136, score-1.164]

83 In contrast, recent research on image coding has stressed the importance of localized, independent features derived from metrics that emphasize the higher-order statistics of inputs [6, 3, 7, 5]. [sent-137, score-0.524]

84 This paper introduces a new probabilistic framework for learning bilinear generative models based on the idea of sparse coding. [sent-138, score-1.034]

85 Our results demonstrate that bilinear sparse coding of natural images produces localized oriented basis vectors that can simultaneously represent features in an image and their transformation. [sent-139, score-1.902]

86 We showed how the learned generative model can be used to translate a ... Figure 3: Modeling independently transformed features. [sent-140, score-0.145]

87 (a) shows the standard bilinear method of generating a translated feature by combining basis vectors using the same set of y values for two different features (x81 and x57). [sent-141, score-1.198]

88 (b) shows four examples of images generated by allowing different values of for the two different features. [sent-142, score-0.094]

89 Note the significant differences between the resulting images, which cannot be obtained using the standard bilinear model. [sent-143, score-0.792]

90 basis vector to different locations, thereby reducing the need to learn the same basis vector at multiple locations as in traditional sparse coding methods. [sent-145, score-0.677]

91 We also proposed an extension of the bilinear model that allows each feature to be transformed independently of other features. [sent-146, score-0.881]

92 Our preliminary results suggest that such an approach could provide a flexible platform for adaptive parts-based object recognition, wherein objects are described by a set of independent, shared parts and their transformations. [sent-147, score-0.318]

93 The importance of parts-based methods has long been recognized in object recognition in view of their ability to handle a combinatorially large number of objects by combining parts and their transformations. [sent-148, score-0.276]

94 Few methods, if any, exist for learning representations of object parts and their transformations directly from images. [sent-149, score-0.338]

95 Our ongoing efforts are therefore focused on deriving efficient algorithms for parts-based object recognition based on the combination of bilinear models and sparse coding. [sent-150, score-1.116]

96 The ‘independent components’ of natural scenes are edge filters. [sent-170, score-0.053]

97 Emergence of simple-cell receptive field properties by learning a sparse code for natural images. [sent-190, score-0.295]

98 Sparse coding with an overcomplete basis set: A strategy employed by V1? [sent-197, score-0.324]

99 Development of localized oriented receptive fields by learning a translation-invariant code for natural images. [sent-205, score-0.33]

100 Predictive coding in the visual cortex: A functional interpretation of some extra-classical receptive field effects. [sent-213, score-0.263]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('bilinear', 0.792), ('coding', 0.191), ('image', 0.172), ('sparse', 0.149), ('transformations', 0.148), ('object', 0.133), ('patch', 0.128), ('localized', 0.109), ('wi', 0.109), ('basis', 0.104), ('sparseness', 0.099), ('images', 0.094), ('features', 0.093), ('translated', 0.092), ('ica', 0.082), ('olshausen', 0.076), ('oriented', 0.075), ('generative', 0.074), ('rao', 0.071), ('style', 0.068), ('parts', 0.057), ('redundancy', 0.056), ('descent', 0.055), ('natural', 0.053), ('learned', 0.052), ('code', 0.052), ('transformation', 0.052), ('tenenbaum', 0.05), ('coef', 0.049), ('feature', 0.049), ('vectors', 0.049), ('kurtotic', 0.048), ('locations', 0.044), ('batch', 0.044), ('svd', 0.044), ('objects', 0.044), ('equation', 0.042), ('content', 0.042), ('wherein', 0.042), ('receptive', 0.041), ('freeman', 0.04), ('translations', 0.04), ('likewise', 0.036), ('field', 0.035), ('gradient', 0.031), ('visual', 0.031), ('imposing', 0.029), ('overcomplete', 0.029), ('cients', 0.029), ('sensory', 0.028), ('global', 0.028), ('local', 0.028), ('pixels', 0.027), ('redundant', 0.027), ('priors', 0.026), ('feasibility', 0.026), ('canonical', 0.026), ('inputs', 0.025), ('encodes', 0.025), ('growing', 0.024), ('bell', 0.024), ('mode', 0.024), ('independent', 0.024), ('representing', 0.023), ('convergence', 0.023), ('minimize', 0.023), ('recognition', 0.023), ('regarded', 0.022), ('translation', 0.022), ('bar', 0.022), ('vector', 0.022), ('equivalently', 0.022), ('learn', 0.022), ('produces', 0.021), ('interactions', 0.021), ('patches', 0.021), ('extension', 0.021), ('shared', 0.021), ('attneave', 0.021), ('informational', 0.021), ('platform', 0.021), ('rosenblith', 0.021), ('traced', 0.021), ('unsuitable', 0.021), ('vision', 0.021), ('consisted', 0.02), ('variances', 0.02), ('thereby', 0.019), ('combining', 0.019), ('assist', 0.019), ('barlow', 0.019), ('casting', 0.019), ('discussing', 0.019), ('stressed', 0.019), ('translating', 0.019), ('models', 0.019), ('rich', 0.019), ('rows', 0.019), ('transformed', 0.019), ('elds', 0.018)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000004 2 nips-2002-A Bilinear Model for Sparse Coding

Author: David B. Grimes, Rajesh P. Rao

Abstract: Recent algorithms for sparse coding and independent component analysis (ICA) have demonstrated how localized features can be learned from natural images. However, these approaches do not take image transformations into account. As a result, they produce image codes that are redundant because the same feature is learned at multiple locations. We describe an algorithm for sparse coding based on a bilinear generative model of images. By explicitly modeling the interaction between image features and their transformations, the bilinear approach helps reduce redundancy in the image code and provides a basis for transformation-invariant vision. We present results demonstrating bilinear sparse coding of natural images. We also explore an extension of the model that can capture spatial relationships between the independent features of an object, thereby providing a new framework for parts-based object recognition.

2 0.28157124 10 nips-2002-A Model for Learning Variance Components of Natural Images

Author: Yan Karklin, Michael S. Lewicki

Abstract: We present a hierarchical Bayesian model for learning efficient codes of higher-order structure in natural images. The model, a non-linear generalization of independent component analysis, replaces the standard assumption of independence for the joint distribution of coefficients with a distribution that is adapted to the variance structure of the coefficients of an efficient image basis. This offers a novel description of higherorder image structure and provides a way to learn coarse-coded, sparsedistributed representations of abstract image properties such as object location, scale, and texture.

3 0.16072565 57 nips-2002-Concurrent Object Recognition and Segmentation by Graph Partitioning

Author: Stella X. Yu, Ralph Gross, Jianbo Shi

Abstract: Segmentation and recognition have long been treated as two separate processes. We propose a mechanism based on spectral graph partitioning that readily combine the two processes into one. A part-based recognition system detects object patches, supplies their partial segmentations as well as knowledge about the spatial configurations of the object. The goal of patch grouping is to find a set of patches that conform best to the object configuration, while the goal of pixel grouping is to find a set of pixels that have the best low-level feature similarity. Through pixel-patch interactions and between-patch competition encoded in the solution space, these two processes are realized in one joint optimization problem. The globally optimal partition is obtained by solving a constrained eigenvalue problem. We demonstrate that the resulting object segmentation eliminates false positives for the part detection, while overcoming occlusion and weak contours for the low-level edge detection.

4 0.12678787 193 nips-2002-Temporal Coherence, Natural Image Sequences, and the Visual Cortex

Author: Jarmo Hurri, Aapo Hyvärinen

Abstract: We show that two important properties of the primary visual cortex emerge when the principle of temporal coherence is applied to natural image sequences. The properties are simple-cell-like receptive fields and complex-cell-like pooling of simple cell outputs, which emerge when we apply two different approaches to temporal coherence. In the first approach we extract receptive fields whose outputs are as temporally coherent as possible. This approach yields simple-cell-like receptive fields (oriented, localized, multiscale). Thus, temporal coherence is an alternative to sparse coding in modeling the emergence of simple cell receptive fields. The second approach is based on a two-layer statistical generative model of natural image sequences. In addition to modeling the temporal coherence of individual simple cells, this model includes inter-cell temporal dependencies. Estimation of this model from natural data yields both simple-cell-like receptive fields, and complex-cell-like pooling of simple cell outputs. In this completely unsupervised learning, both layers of the generative model are estimated simultaneously from scratch. This is a significant improvement on earlier statistical models of early vision, where only one layer has been learned, and others have been fixed a priori.

5 0.11785758 122 nips-2002-Learning About Multiple Objects in Images: Factorial Learning without Factorial Search

Author: Christopher Williams, Michalis K. Titsias

Abstract: We consider data which are images containing views of multiple objects. Our task is to learn about each of the objects present in the images. This task can be approached as a factorial learning problem, where each image must be explained by instantiating a model for each of the objects present with the correct instantiation parameters. A major problem with learning a factorial model is that as the number of objects increases, there is a combinatorial explosion of the number of configurations that need to be considered. We develop a method to extract object models sequentially from the data by making use of a robust statistical method, thus avoiding the combinatorial explosion, and present results showing successful extraction of objects from real images.

6 0.11587991 126 nips-2002-Learning Sparse Multiscale Image Representations

7 0.11518933 39 nips-2002-Bayesian Image Super-Resolution

8 0.11510459 118 nips-2002-Kernel-Based Extraction of Slow Features: Complex Cells Learn Disparity and Translation Invariance from Natural Images

9 0.10801069 116 nips-2002-Interpreting Neural Response Variability as Monte Carlo Sampling of the Posterior

10 0.10509691 132 nips-2002-Learning to Detect Natural Image Boundaries Using Brightness and Texture

11 0.086919852 87 nips-2002-Fast Transformation-Invariant Factor Analysis

12 0.086739585 127 nips-2002-Learning Sparse Topographic Representations with Products of Student-t Distributions

13 0.086512774 14 nips-2002-A Probabilistic Approach to Single Channel Blind Signal Separation

14 0.081602916 173 nips-2002-Recovering Intrinsic Images from a Single Image

15 0.079324916 202 nips-2002-Unsupervised Color Constancy

16 0.078974664 206 nips-2002-Visual Development Aids the Acquisition of Motion Velocity Sensitivities

17 0.074139982 74 nips-2002-Dynamic Structure Super-Resolution

18 0.070867762 133 nips-2002-Learning to Perceive Transparency from the Statistics of Natural Scenes

19 0.065503471 88 nips-2002-Feature Selection and Classification on Matrix Data: From Large Margins to Small Covering Numbers

20 0.061743457 19 nips-2002-Adapting Codes and Embeddings for Polychotomies


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.187), (1, 0.029), (2, 0.007), (3, 0.298), (4, 0.009), (5, -0.073), (6, 0.135), (7, -0.056), (8, 0.001), (9, -0.017), (10, -0.001), (11, -0.059), (12, 0.039), (13, 0.045), (14, 0.06), (15, 0.087), (16, 0.1), (17, -0.03), (18, -0.05), (19, 0.091), (20, -0.097), (21, -0.073), (22, -0.018), (23, 0.012), (24, 0.006), (25, -0.094), (26, 0.03), (27, -0.021), (28, 0.185), (29, 0.106), (30, -0.01), (31, 0.023), (32, -0.121), (33, 0.054), (34, -0.215), (35, -0.096), (36, -0.001), (37, -0.001), (38, -0.002), (39, -0.071), (40, -0.032), (41, -0.045), (42, -0.09), (43, -0.021), (44, -0.006), (45, -0.116), (46, -0.013), (47, -0.002), (48, -0.095), (49, -0.152)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.95951712 2 nips-2002-A Bilinear Model for Sparse Coding

Author: David B. Grimes, Rajesh P. Rao

Abstract: Recent algorithms for sparse coding and independent component analysis (ICA) have demonstrated how localized features can be learned from natural images. However, these approaches do not take image transformations into account. As a result, they produce image codes that are redundant because the same feature is learned at multiple locations. We describe an algorithm for sparse coding based on a bilinear generative model of images. By explicitly modeling the interaction between image features and their transformations, the bilinear approach helps reduce redundancy in the image code and provides a basis for transformation-invariant vision. We present results demonstrating bilinear sparse coding of natural images. We also explore an extension of the model that can capture spatial relationships between the independent features of an object, thereby providing a new framework for parts-based object recognition.

2 0.84945887 10 nips-2002-A Model for Learning Variance Components of Natural Images

Author: Yan Karklin, Michael S. Lewicki

Abstract: We present a hierarchical Bayesian model for learning efficient codes of higher-order structure in natural images. The model, a non-linear generalization of independent component analysis, replaces the standard assumption of independence for the joint distribution of coefficients with a distribution that is adapted to the variance structure of the coefficients of an efficient image basis. This offers a novel description of higherorder image structure and provides a way to learn coarse-coded, sparsedistributed representations of abstract image properties such as object location, scale, and texture.

3 0.67461896 126 nips-2002-Learning Sparse Multiscale Image Representations

Author: Phil Sallee, Bruno A. Olshausen

Abstract: We describe a method for learning sparse multiscale image representations using a sparse prior distribution over the basis function coefficients. The prior consists of a mixture of a Gaussian and a Dirac delta function, and thus encourages coefficients to have exact zero values. Coefficients for an image are computed by sampling from the resulting posterior distribution with a Gibbs sampler. The learned basis is similar to the Steerable Pyramid basis, and yields slightly higher SNR for the same number of active coefficients. Denoising using the learned image model is demonstrated for some standard test images, with results that compare favorably with other denoising methods. 1

4 0.62002593 57 nips-2002-Concurrent Object Recognition and Segmentation by Graph Partitioning

Author: Stella X. Yu, Ralph Gross, Jianbo Shi

Abstract: Segmentation and recognition have long been treated as two separate processes. We propose a mechanism based on spectral graph partitioning that readily combine the two processes into one. A part-based recognition system detects object patches, supplies their partial segmentations as well as knowledge about the spatial configurations of the object. The goal of patch grouping is to find a set of patches that conform best to the object configuration, while the goal of pixel grouping is to find a set of pixels that have the best low-level feature similarity. Through pixel-patch interactions and between-patch competition encoded in the solution space, these two processes are realized in one joint optimization problem. The globally optimal partition is obtained by solving a constrained eigenvalue problem. We demonstrate that the resulting object segmentation eliminates false positives for the part detection, while overcoming occlusion and weak contours for the low-level edge detection.

5 0.52481437 193 nips-2002-Temporal Coherence, Natural Image Sequences, and the Visual Cortex

Author: Jarmo Hurri, Aapo Hyvärinen

Abstract: We show that two important properties of the primary visual cortex emerge when the principle of temporal coherence is applied to natural image sequences. The properties are simple-cell-like receptive fields and complex-cell-like pooling of simple cell outputs, which emerge when we apply two different approaches to temporal coherence. In the first approach we extract receptive fields whose outputs are as temporally coherent as possible. This approach yields simple-cell-like receptive fields (oriented, localized, multiscale). Thus, temporal coherence is an alternative to sparse coding in modeling the emergence of simple cell receptive fields. The second approach is based on a two-layer statistical generative model of natural image sequences. In addition to modeling the temporal coherence of individual simple cells, this model includes inter-cell temporal dependencies. Estimation of this model from natural data yields both simple-cell-like receptive fields, and complex-cell-like pooling of simple cell outputs. In this completely unsupervised learning, both layers of the generative model are estimated simultaneously from scratch. This is a significant improvement on earlier statistical models of early vision, where only one layer has been learned, and others have been fixed a priori.

6 0.51287091 122 nips-2002-Learning About Multiple Objects in Images: Factorial Learning without Factorial Search

7 0.48881325 132 nips-2002-Learning to Detect Natural Image Boundaries Using Brightness and Texture

8 0.45910046 127 nips-2002-Learning Sparse Topographic Representations with Products of Student-t Distributions

9 0.41376629 133 nips-2002-Learning to Perceive Transparency from the Statistics of Natural Scenes

10 0.40868348 118 nips-2002-Kernel-Based Extraction of Slow Features: Complex Cells Learn Disparity and Translation Invariance from Natural Images

11 0.40188119 14 nips-2002-A Probabilistic Approach to Single Channel Blind Signal Separation

12 0.38106701 190 nips-2002-Stochastic Neighbor Embedding

13 0.37903601 116 nips-2002-Interpreting Neural Response Variability as Monte Carlo Sampling of the Posterior

14 0.36758211 87 nips-2002-Fast Transformation-Invariant Factor Analysis

15 0.35380498 131 nips-2002-Learning to Classify Galaxy Shapes Using the EM Algorithm

16 0.34980005 110 nips-2002-Incremental Gaussian Processes

17 0.3432155 150 nips-2002-Multiple Cause Vector Quantization

18 0.33305225 173 nips-2002-Recovering Intrinsic Images from a Single Image

19 0.31671971 202 nips-2002-Unsupervised Color Constancy

20 0.29745683 206 nips-2002-Visual Development Aids the Acquisition of Motion Velocity Sensitivities


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(1, 0.017), (11, 0.023), (14, 0.03), (23, 0.028), (42, 0.067), (54, 0.13), (55, 0.095), (64, 0.053), (67, 0.025), (68, 0.012), (72, 0.105), (74, 0.166), (92, 0.028), (98, 0.109)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.93021923 2 nips-2002-A Bilinear Model for Sparse Coding

Author: David B. Grimes, Rajesh P. Rao

Abstract: Recent algorithms for sparse coding and independent component analysis (ICA) have demonstrated how localized features can be learned from natural images. However, these approaches do not take image transformations into account. As a result, they produce image codes that are redundant because the same feature is learned at multiple locations. We describe an algorithm for sparse coding based on a bilinear generative model of images. By explicitly modeling the interaction between image features and their transformations, the bilinear approach helps reduce redundancy in the image code and provides a basis for transformation-invariant vision. We present results demonstrating bilinear sparse coding of natural images. We also explore an extension of the model that can capture spatial relationships between the independent features of an object, thereby providing a new framework for parts-based object recognition.

2 0.88467109 145 nips-2002-Mismatch String Kernels for SVM Protein Classification

Author: Eleazar Eskin, Jason Weston, William S. Noble, Christina S. Leslie

Abstract: We introduce a class of string kernels, called mismatch kernels, for use with support vector machines (SVMs) in a discriminative approach to the protein classification problem. These kernels measure sequence similarity based on shared occurrences of -length subsequences, counted with up to mismatches, and do not rely on any generative model for the positive training sequences. We compute the kernels efficiently using a mismatch tree data structure and report experiments on a benchmark SCOP dataset, where we show that the mismatch kernel used with an SVM classifier performs as well as the Fisher kernel, the most successful method for remote homology detection, while achieving considerable computational savings.

3 0.87668854 10 nips-2002-A Model for Learning Variance Components of Natural Images

Author: Yan Karklin, Michael S. Lewicki

Abstract: We present a hierarchical Bayesian model for learning efficient codes of higher-order structure in natural images. The model, a non-linear generalization of independent component analysis, replaces the standard assumption of independence for the joint distribution of coefficients with a distribution that is adapted to the variance structure of the coefficients of an efficient image basis. This offers a novel description of higherorder image structure and provides a way to learn coarse-coded, sparsedistributed representations of abstract image properties such as object location, scale, and texture.

4 0.86660409 178 nips-2002-Robust Novelty Detection with Single-Class MPM

Author: Laurent E. Ghaoui, Michael I. Jordan, Gert R. Lanckriet

Abstract: In this paper we consider the problem of novelty detection, presenting an algorithm that aims to find a minimal region in input space containing a fraction 0: of the probability mass underlying a data set. This algorithm- the

5 0.86116195 193 nips-2002-Temporal Coherence, Natural Image Sequences, and the Visual Cortex

Author: Jarmo Hurri, Aapo Hyvärinen

Abstract: We show that two important properties of the primary visual cortex emerge when the principle of temporal coherence is applied to natural image sequences. The properties are simple-cell-like receptive fields and complex-cell-like pooling of simple cell outputs, which emerge when we apply two different approaches to temporal coherence. In the first approach we extract receptive fields whose outputs are as temporally coherent as possible. This approach yields simple-cell-like receptive fields (oriented, localized, multiscale). Thus, temporal coherence is an alternative to sparse coding in modeling the emergence of simple cell receptive fields. The second approach is based on a two-layer statistical generative model of natural image sequences. In addition to modeling the temporal coherence of individual simple cells, this model includes inter-cell temporal dependencies. Estimation of this model from natural data yields both simple-cell-like receptive fields, and complex-cell-like pooling of simple cell outputs. In this completely unsupervised learning, both layers of the generative model are estimated simultaneously from scratch. This is a significant improvement on earlier statistical models of early vision, where only one layer has been learned, and others have been fixed a priori.

6 0.85974967 132 nips-2002-Learning to Detect Natural Image Boundaries Using Brightness and Texture

7 0.85875869 122 nips-2002-Learning About Multiple Objects in Images: Factorial Learning without Factorial Search

8 0.85808009 28 nips-2002-An Information Theoretic Approach to the Functional Classification of Neurons

9 0.85602564 89 nips-2002-Feature Selection by Maximum Marginal Diversity

10 0.85282153 39 nips-2002-Bayesian Image Super-Resolution

11 0.85134363 83 nips-2002-Extracting Relevant Structures with Side Information

12 0.85049099 206 nips-2002-Visual Development Aids the Acquisition of Motion Velocity Sensitivities

13 0.85010004 52 nips-2002-Cluster Kernels for Semi-Supervised Learning

14 0.84696847 175 nips-2002-Reinforcement Learning to Play an Optimal Nash Equilibrium in Team Markov Games

15 0.84569901 162 nips-2002-Parametric Mixture Models for Multi-Labeled Text

16 0.84167725 188 nips-2002-Stability-Based Model Selection

17 0.84092271 27 nips-2002-An Impossibility Theorem for Clustering

18 0.84086543 74 nips-2002-Dynamic Structure Super-Resolution

19 0.84002572 163 nips-2002-Prediction and Semantic Association

20 0.83914399 53 nips-2002-Clustering with the Fisher Score