nips nips2002 nips2002-39 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Michael E. Tipping, Christopher M. Bishop
Abstract: The extraction of a single high-quality image from a set of low-resolution images is an important problem which arises in fields such as remote sensing, surveillance, medical imaging and the extraction of still images from video. Typical approaches are based on the use of cross-correlation to register the images followed by the inversion of the transformation from the unknown high resolution image to the observed low resolution images, using regularization to resolve the ill-posed nature of the inversion process. In this paper we develop a Bayesian treatment of the super-resolution problem in which the likelihood function for the image registration parameters is based on a marginalization over the unknown high-resolution image. This approach allows us to estimate the unknown point spread function, and is rendered tractable through the introduction of a Gaussian process prior over images. Results indicate a significant improvement over techniques based on MAP (maximum a-posteriori) point optimization of the high resolution image and associated registration parameters.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract The extraction of a single high-quality image from a set of low-resolution images is an important problem which arises in fields such as remote sensing, surveillance, medical imaging and the extraction of still images from video. [sent-8, score-0.849]
2 Typical approaches are based on the use of cross-correlation to register the images followed by the inversion of the transformation from the unknown high resolution image to the observed low resolution images, using regularization to resolve the ill-posed nature of the inversion process. [sent-9, score-2.122]
3 In this paper we develop a Bayesian treatment of the super-resolution problem in which the likelihood function for the image registration parameters is based on a marginalization over the unknown high-resolution image. [sent-10, score-0.93]
4 This approach allows us to estimate the unknown point spread function, and is rendered tractable through the introduction of a Gaussian process prior over images. [sent-11, score-0.295]
5 Results indicate a significant improvement over techniques based on MAP (maximum a-posteriori) point optimization of the high resolution image and associated registration parameters. [sent-12, score-1.482]
6 1 Introduction The task in super-resolution is to combine a set of low resolution images of the same scene in order to obtain a single image of higher resolution. [sent-13, score-1.354]
7 Provided the individual low resolution images have sub-pixel displacements relative to each other, it is possible to extract high frequency details of the scene well beyond the Nyquist limit of the individual source images. [sent-14, score-1.041]
8 Ideally the low resolution images would differ only through small (sub-pixel) translations, and would be otherwise identical. [sent-15, score-0.86]
9 In practice, the transformations may be more substantial and involve rotations or more complex geometric distortions. [sent-16, score-0.098]
10 In addition the scene itself may change, for instance if the source images are successive frames in a video sequence. [sent-17, score-0.34]
11 Here we focus attention on static scenes in which the transformations relating the source images correspond to translations and rotations, such as can be obtained by taking several images in succession using a hand held digital camera. [sent-18, score-0.539]
12 Our approach is readily extended to more general projective transformations if desired. [sent-19, score-0.051]
13 Larger changes in camera position or orientation can be handled using techniques of robust feature matching, constrained by the epipolar geometry, but such sophistication is unnecessary in the present context. [sent-20, score-0.072]
14 Most previous approaches, for example [1, 2, 3], perform an initial registration of the low resolution images with respect to each other, and then keep this registration fixed. [sent-21, score-1.524]
15 They then formulate probabilistic models of the image generation process, and use maximum likelihood to determine the pixel intensities in the high resolution image. [sent-22, score-1.318]
16 A more convincing approach [4] is to determine simultaneously both the low resolution image registration parameters and the pixel values of the high resolution image, again through maximum likelihood. [sent-23, score-2.273]
17 An obvious difficulty of these techniques is that if the high resolution image has too few pixels then not all of the available high frequency information is extracted from the observed images, whereas if it has too many pixels the maximum likelihood solution becomes ill conditioned. [sent-24, score-1.481]
18 This is typically resolved by the introduction of penalty terms to regularize the maximum likelihood solution, where the regularization coefficients may be set by cross-validation. [sent-25, score-0.104]
19 The regularization terms are often motivated in terms of a prior distribution over the high resolution image, in which case the solution can be interpreted as a MAP (maximum a-posteriori) optimization. [sent-26, score-0.739]
20 Baker and Kanade [5] have tried to improve the performance of super-resolution algorithms by developing domain-specific image priors, applicable to faces or text for example, which are learned from data. [sent-27, score-0.428]
21 In this case the algorithm is effectively hallucinating perceptually plausible high frequency features. [sent-28, score-0.094]
22 Here we focus on general purpose algorithms applicable to any natural image, for which the prior encodes only high level information such as the correlation of nearby pixels. [sent-29, score-0.175]
23 The key development in this paper, which distinguishes it from previous approaches, is the use of Bayesian, rather than simply MAP, techniques by marginalizing over the unknown high resolution image in order to determine the low resolution image registration parameters. [sent-30, score-2.695]
24 Our formulation also allows the choice of continuous values for the up-sampling process, as well as the shift and rotation parameters governing the image registration. [sent-31, score-0.738]
25 The generative process by which the high resolution image is smoothed to obtain a low resolution image is described by a point spread function (PSF). [sent-32, score-2.418]
26 It has often been assumed that the point spread function is known in advance, which is unrealistic. [sent-33, score-0.167]
27 Some authors [3] have estimated the PSF in advance using only the low resolution image data, and then kept this estimate fixed while extracting the high resolution image. [sent-34, score-1.838]
28 A key advantage of our Bayesian marginalization is that it allows us to determine the point spread function alongside both the registration parameters and the high resolution image in a single, coherent inference framework. [sent-35, score-1.721]
29 As we show later, if we attempt to determine the PSF as well as the registration parameters and the high resolution image by joint optimization, we obtain highly biased (over-fitted) results. [sent-36, score-1.509]
30 By marginalizing over the unknown high resolution image we are able to determine the PSF and the registration parameters accurately, and thereby reconstruct the high resolution image with subjectively very good quality. [sent-37, score-2.724]
31 2 Bayesian Super-resolution Suppose we are given K low-resolution intensity images (the extension to 3-colour images is straightforward). [sent-38, score-0.361]
32 We shall find it convenient notationally to represent the images as vectors y(k) of length M, where k = 1, . . . , K, obtained by raster scanning the pixels of the images. [sent-39, score-0.202] [sent-42, score-0.163]
34 Each image is shifted and rotated relative to a reference image which we shall arbitrarily take to be y(1). [sent-43, score-0.909]
35 The shifts are described by 2-dimensional vectors sk, and the rotations are described by angles θk. [sent-44, score-0.16]
36 The goal is to infer the underlying scene from which the low resolution images are generated. [sent-45, score-0.945]
37 We represent this scene by a single high-resolution image, which we again denote by a raster-scan vector x whose length is N ≫ M. [sent-46, score-0.066]
38 Our approach is based on a generative model for the observed low resolution images, comprising a prior over the high resolution image together with an observation model describing the process by which a low resolution image is obtained from the high resolution one. [sent-47, score-3.699]
39 It should be emphasized that the real scene which we are trying to infer has effectively an infinite resolution, and that its description as a pixellated image is a computational artefact. [sent-48, score-0.513]
40 In particular, if we take the number N of pixels in this image to be large, the inference algorithm should remain well behaved. [sent-49, score-0.541]
41 This is not the case with maximum likelihood approaches in which the value of N must be limited to avoid ill-conditioning. [sent-50, score-0.075]
42 In our approach, if N is large the correlation of neighbouring pixels is determined primarily by the prior, and the value of N is limited only by the computational cost of working with large numbers of high resolution pixels. [sent-51, score-0.804]
43 We represent the prior over the high resolution image by a Gaussian process p(x) = N(x | 0, Zx) (1) where the covariance matrix Zx is chosen to be of the form Zx(i, j) = A exp{ -||vi - vj||^2 / r^2 }. (2) [sent-52, score-1.173]
44 Here vi denotes the spatial position in the 2-dimensional image space of pixel i, the coefficient A measures the 'strength' of the prior, and r defines the correlation length scale. [sent-53, score-0.524]
45 Since we take Zx to be a fixed matrix, it is straightforward to use a different functional form for Zx if desired. [sent-54, score-0.047]
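As a concrete illustration (not taken from the paper's own code), the covariance of (2) can be constructed as follows; the regular-grid layout and the default values of A and r are placeholder assumptions.

```python
import numpy as np
from scipy.spatial.distance import cdist

def gp_prior_covariance(height, width, A=0.04, r=1.0):
    """Zx(i, j) = A * exp(-||v_i - v_j||^2 / r^2) over a regular pixel grid.

    A ('strength') and r (correlation length scale) are the prior parameters
    of (2); the default values here are illustrative placeholders.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    v = np.column_stack([xs.ravel(), ys.ravel()]).astype(float)  # positions v_i
    return A * np.exp(-cdist(v, v, "sqeuclidean") / r ** 2)

# e.g. the prior over a 50 x 50 high-resolution patch (N = 2500 pixels)
Zx = gp_prior_covariance(50, 50)
```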
46 It should be noted that in our image representation the pixel intensity values lie in the range (-0.5, 0.5), and so in principle a Gaussian process prior is inappropriate1. [sent-55, score-0.525] [sent-57, score-0.075]
48 The low resolution images are assumed to be generated from the high resolution image by first applying a shift and a rotation, then convolving with some point spread function, and finally downsampling to the lower resolution. [sent-59, score-2.253]
49 Here γ represents the 'width' of the point spread function, and we shall treat γ as an unknown parameter to be determined from the data. [sent-69, score-0.252]
50 Note that our approach generalizes readily to any other form of point spread function, possibly containing several unknown parameters, provided it is differentiable with respect to those parameters. [sent-70, score-0.297]
51 In (5) the vector uj(k) is the centre of the PSF and is dependent on the shift and rotation of the low resolution image. [sent-71, score-1.036]
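A sketch of this generative mapping is given below, assuming the isotropic Gaussian PSF form implied by the width parameter γ; the per-row normalisation and the rotate-then-shift convention for placing the centres uj(k) are our own illustrative assumptions, not a transcription of the paper's equations.

```python
import numpy as np

def psf_row(v_hr, centre, gamma):
    """Gaussian PSF weights over high-resolution pixel positions v_hr (N x 2),
    centred at `centre` and normalised to sum to one (cf. equation (5))."""
    w = np.exp(-((v_hr - centre) ** 2).sum(axis=-1) / gamma ** 2)
    return w / w.sum()

def build_W(v_hr, lr_positions, shift, theta, gamma):
    """One PSF row per low-resolution pixel gives the M x N matrix W(k);
    shift (2-vector) and theta (angle) are the registration of image k."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    centres = lr_positions @ R.T + shift          # PSF centres u_j^(k)
    return np.stack([psf_row(v_hr, u, gamma) for u in centres])

# A low-resolution image is then modelled as y(k) = W(k) x + noise,
# with the noise taken to be Gaussian with precision beta.
```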
52 The posterior covariance and mean are given by Σ = (Zx^-1 + β W^T W)^-1 and μ = β Σ W^T y (11)-(12), where β is the noise precision of the observation model; thus the posterior distribution over the high resolution image is again a Gaussian process. [sent-74, score-1.13]
53 If we knew the registration parameters {sk, θk}, as well as the PSF width parameter γ, then we could simply take the mean μ of this posterior distribution as our estimate of the high resolution image. [sent-75, score-0.45]
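For fixed {sk, θk} and γ this is the standard linear-Gaussian posterior; a minimal sketch follows (forming explicit inverses for clarity, which one would avoid for large N), with W and y the stacked W(k) and y(k) and beta the noise precision.

```python
import numpy as np

def posterior_x(W, y, Zx, beta):
    """Posterior over the high-resolution image x given the registration:
    Sigma = (Zx^-1 + beta * W^T W)^-1,  mu = beta * Sigma @ W^T y."""
    Sigma = np.linalg.inv(np.linalg.inv(Zx) + beta * W.T @ W)
    mu = beta * Sigma @ (W.T @ y)
    return mu, Sigma
```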
54 Neither approach takes account of the uncertainty in determining the high resolution image and the consequential effects on the optimization of the registration parameters. [sent-79, score-1.486]
55 Here we adopt a Bayesian approach by marginalizing out the unknown high resolution image. [sent-80, score-0.79]
56 This gives the marginal likelihood function for the low resolution images in the form p(y | {sk, θk}, γ) = N(y | 0, Zy) (13) where Zy = β^-1 I + W Zx W^T (14) and y and W are the vector and matrix of stacked y(k) and W(k) respectively. [sent-81, score-0.954]
57 Using some standard matrix manipulations we can rewrite the marginal likelihood in the form log p(y | {sk, θk}, γ) = -1/2 [ β ||y - Wμ||^2 + μ^T Zx^-1 μ - log|Σ| + log|Zx| + KM log(2π/β) ]. (15) [sent-82, score-0.094]
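An equivalent, if computationally naive, way to evaluate this objective is directly through the Gaussian form (13)-(14); the sketch below takes the stacked W, y, Zx and beta defined above and is intended only to make the quantity being optimized explicit.

```python
import numpy as np

def log_marginal_likelihood(W, y, Zx, beta):
    """log p(y | {s_k, theta_k}, gamma) = log N(y | 0, Zy),
    with Zy = (1/beta) I + W Zx W^T as in (13)-(14)."""
    KM = y.shape[0]
    Zy = np.eye(KM) / beta + W @ Zx @ W.T
    _, logdet = np.linalg.slogdet(Zy)
    quad = y @ np.linalg.solve(Zy, y)
    return -0.5 * (KM * np.log(2.0 * np.pi) + logdet + quad)
```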
58 We now wish to optimize this marginal likelihood with respect to the parameters {sk, θk} and γ, and to do this we have compared two approaches. [sent-88, score-0.186]
59 The first approach is the EM algorithm: in the E-step we evaluate the posterior distribution over the high resolution image given by (10). [sent-90, score-1.13]
60 In the M-step we maximize the expectation over x of the log of the complete data likelihood p(y, x | {sk, θk}, γ) obtained from the product of the prior (1) and the likelihood (8). [sent-91, score-0.175]
61 The second approach is to maximize the marginal likelihood (15) directly using SCG (scaled conjugate gradients). [sent-93, score-0.094]
62 Empirically we find that direct optimization is faster than EM, and so has been used to obtain the results reported in this paper. [sent-94, score-0.042]
63 Since in (15) we must compute Σ, which is N x N, in practice we optimize the shift, rotation and PSF width parameters based on an appropriately-sized subset of the image only. [sent-95, score-0.75]
64 The complete high resolution image is then found as the mode of the full posterior distribution, obtained iteratively by maximizing the numerator in (9), again using SCG optimization. [sent-96, score-1.183]
65 3 Results In order to evaluate our approach we first apply it to a set of low resolution images synthetically down-sampled (by a linear scaling of 4 to 1, or 16 pixels to 1) from a known high-resolution image as follows. [sent-97, score-1.422]
66 Finally we determine the value at each pixel of the low resolution image by convolution of the original image with the point spread function (centred on the low resolution pixel), with width parameter γ = 2. [sent-99, score-2.599]
67 From a high-resolution image of 384 x 256 we chose to use a set of 16 images of resolution 96 x 64. [sent-101, score-1.174]
68 To limit the computational cost we use 9 x 9 patches from the centre of the low resolution images to determine the values of the shift, rotation and PSF width parameters. [sent-102, score-1.456]
69 We set the resolution of the super-resolved image to have 16 times as many pixels as the low resolution images which, allowing for shifts and the support of the point spread function, gives N = 50 x 50. [sent-103, score-2.215]
70 The Gaussian process prior is chosen to have width parameter r = 1. [sent-104, score-0.156]
71 The scaled conjugate gradient optimization is initialised by setting the shift and rotation parameters equal to zero, while the PSF width γ is initialised to 4.0 since this is the upsampling factor we have chosen between low resolution and super-resolved images. [sent-109, score-0.455] [sent-110, score-0.711]
73 We first optimize only the shifts, then we optimize both shifts and rotations, and finally we optimize shifts, rotations and PSF width, in each case running until a suitable convergence tolerance is reached. [sent-111, score-0.28]
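The staged schedule can be sketched as follows, using scipy's conjugate-gradient optimizer as a stand-in for SCG; the objective neg_log_ml (a wrapper around the marginal likelihood evaluation above that only frees the named parameter blocks) and the parameter packing are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def staged_registration(neg_log_ml, K, gamma_init=4.0):
    """Optimize shifts, then shifts + rotations, then all parameters
    including the PSF width, warm-starting each stage from the last."""
    params = np.concatenate([np.zeros(2 * K),        # shifts s_k
                             np.zeros(K),            # rotations theta_k
                             [np.log(gamma_init)]])  # log PSF width gamma
    for free in (("shifts",),
                 ("shifts", "rotations"),
                 ("shifts", "rotations", "gamma")):
        result = minimize(neg_log_ml, params, args=(free,),
                          method="CG", tol=1e-6)
        params = result.x
    return params
```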
74 In Figure 1(a) we show the original image, together with an example low resolution image in Figure 1(b). [sent-112, score-1.137]
75 Figure 1(c) shows the super-resolved image obtained using our Bayesian approach. [sent-113, score-0.457]
76 We see that the super-resolved image is of dramatically better quality than the low resolution images from which it is inferred. [sent-114, score-1.288]
77 The converged value for the PSF width parameter is γ = 1. [sent-115, score-0.113]
78 Figure 1: Example using synthetically generated data showing (top left) the original image, (top right) an example low resolution image and (bottom left) the inferred super-resolved image. [sent-118, score-1.16]
79 Also shown, in (bottom right), is a comparison super-resolved image obtained by joint optimization with respect to the super-resolved image and the parameters, demonstrating the significantly poorer result. [sent-119, score-1.013]
80 Notice that there are some small edge effects in the super-resolved image arising from the fact that these pixels only receive evidence from a subset of the low resolution images due to the image shifts. [sent-120, score-1.808]
81 Thus pixels near the edge of the high resolution image are determined primarily by the prior. [sent-121, score-1.212]
82 For comparison we show, in Figure 1(d), the corresponding super-resolved image obtained by performing a MAP optimization with respect to the high resolution image. [sent-122, score-1.193]
83 This is of significantly poorer quality than that obtained from our Bayesian approach. [sent-123, score-0.094]
84 The converged value for the PSF width in this case is γ = 0. [sent-124, score-0.113]
85 In Figure 2 we show plots of the true and estimated values for the shift and rotation parameters using our Bayesian approach and also using MAP optimization. [sent-126, score-0.31]
86 Again we see the severe over-fitting resulting from joint optimization, and the significantly better results obtained from the Bayesian approach. [sent-127, score-0.104]
87 (a) True versus estimated shift parameters (horizontal shift axis) for both the Bayesian and MAP approaches. [sent-145, score-0.128]
88 (b) Comparison of the errors in determining the rotation parameters for both Bayesian and MAP approaches. [sent-191, score-0.208]
89 Finally, we apply our technique to a set of images obtained by taking 16 frames using a hand held digital camera in 'multi-shot' mode (press and hold the shutter release) which takes about 12 seconds. [sent-192, score-0.368]
90 An example image, together with the super-resolved image obtained using our Bayesian algorithm, is shown in Figure 3. [sent-193, score-0.476]
91 4 Discussion In this paper we have proposed a new approach to the problem of image super-resolution, based on a marginalization over the unknown high resolution image using a Gaussian process prior. [sent-194, score-1.662]
92 Our results demonstrate a worthwhile improvement over previous approaches based on MAP estimation, including the ability to estimate parameters of the point spread function. [sent-195, score-0.195]
93 One potential application of our technique is the extraction of high resolution images from video sequences. [sent-196, score-0.896]
94 In this case it will be necessary to take account of motion blur, as well as the registration, for example by tracking moving objects through the successive frames [7]. [sent-197, score-0.098]
95 (a) Low-resolution image (1 of 16) (b) 4x Super-resolved image (Bayesian) Figure 3: Application to real data showing in (a) one of the 16 images captured in succession using a hand held camera of a doorway with a nearby printed sign. [sent-198, score-1.0]
96 Image (b) shows the final image obtained from our Bayesian super-resolution algorithm. [sent-199, score-0.457]
97 Finally, having seen the advantages of marginalizing with respect to the high resolution image, we can extend this approach to a fully Bayesian one based on Markov chain Monte Carlo sampling over all unknown parameters in the model. [sent-200, score-0.842]
98 Since our model is differentiable with respect to these parameters, this can be done efficiently using the hybrid Monte Carlo algorithm. [sent-201, score-0.056]
99 This approach would allow the use of a prior distribution over high resolution pixel intensities which was confined to a bounded interval, instead of the Gaussian assumed in this paper. [sent-202, score-0.816]
100 Joint MAP registration and high-resolution image estimation using a sequence of undersampled images. [sent-231, score-0.748]
wordName wordTfidf (topN-words)
[('resolution', 0.576), ('image', 0.428), ('registration', 0.32), ('psf', 0.288), ('images', 0.17), ('rotation', 0.154), ('spread', 0.145), ('shift', 0.128), ('zx', 0.12), ('low', 0.114), ('high', 0.094), ('shifts', 0.092), ('pixels', 0.092), ('bayesian', 0.089), ('width', 0.081), ('pixel', 0.076), ('rotations', 0.068), ('marginalizing', 0.067), ('scene', 0.066), ('centre', 0.064), ('sk', 0.056), ('likelihood', 0.053), ('unknown', 0.053), ('camera', 0.051), ('seg', 0.048), ('marginalization', 0.048), ('map', 0.043), ('superresolution', 0.042), ('synthetically', 0.042), ('optimization', 0.042), ('marginal', 0.041), ('prior', 0.04), ('optimize', 0.04), ('determine', 0.039), ('held', 0.038), ('poorer', 0.038), ('process', 0.035), ('frames', 0.035), ('succession', 0.034), ('shall', 0.032), ('posterior', 0.032), ('converged', 0.032), ('differentiable', 0.032), ('difficulty', 0.03), ('inversion', 0.03), ('intensities', 0.03), ('extraction', 0.03), ('transformations', 0.03), ('baker', 0.029), ('obtained', 0.029), ('regularization', 0.029), ('parameters', 0.028), ('significantly', 0.027), ('jl', 0.027), ('gaussian', 0.027), ('translations', 0.026), ('determining', 0.026), ('fixed', 0.026), ('video', 0.026), ('interval', 0.025), ('mode', 0.024), ('advance', 0.024), ('respect', 0.024), ('joint', 0.024), ('severe', 0.024), ('successive', 0.022), ('point', 0.022), ('transformation', 0.022), ('maximum', 0.022), ('conjugate', 0.022), ('primarily', 0.022), ('readily', 0.021), ('take', 0.021), ('transactions', 0.021), ('digital', 0.021), ('barnard', 0.021), ('subjectively', 0.021), ('alongside', 0.021), ('cmbishop', 0.021), ('epipolar', 0.021), ('ily', 0.021), ('lowresolution', 0.021), ('netlab', 0.021), ('raster', 0.021), ('scanning', 0.021), ('sensibly', 0.021), ('tz', 0.021), ('upsampling', 0.021), ('xio', 0.021), ('intensity', 0.021), ('nearby', 0.021), ('source', 0.021), ('correlation', 0.02), ('reconstruction', 0.02), ('motion', 0.02), ('practice', 0.019), ('carlo', 0.019), ('infer', 0.019), ('together', 0.019), ('kanade', 0.019)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000001 39 nips-2002-Bayesian Image Super-Resolution
Author: Michael E. Tipping, Christopher M. Bishop
Abstract: The extraction of a single high-quality image from a set of lowresolution images is an important problem which arises in fields such as remote sensing, surveillance, medical imaging and the extraction of still images from video. Typical approaches are based on the use of cross-correlation to register the images followed by the inversion of the transformation from the unknown high resolution image to the observed low resolution images, using regularization to resolve the ill-posed nature of the inversion process. In this paper we develop a Bayesian treatment of the super-resolution problem in which the likelihood function for the image registration parameters is based on a marginalization over the unknown high-resolution image. This approach allows us to estimate the unknown point spread function, and is rendered tractable through the introduction of a Gaussian process prior over images. Results indicate a significant improvement over techniques based on MAP (maximum a-posteriori) point optimization of the high resolution image and associated registration parameters. 1
2 0.47034681 74 nips-2002-Dynamic Structure Super-Resolution
Author: Amos J. Storkey
Abstract: The problem of super-resolution involves generating feasible higher resolution images, which are pleasing to the eye and realistic, from a given low resolution image. This might be attempted by using simple filters for smoothing out the high resolution blocks or through applications where substantial prior information is used to imply the textures and shapes which will occur in the images. In this paper we describe an approach which lies between the two extremes. It is a generic unsupervised method which is usable in all domains, but goes beyond simple smoothing methods in what it achieves. We use a dynamic tree-like architecture to model the high resolution data. Approximate conditioning on the low resolution image is achieved through a mean field approach. 1
3 0.21495171 10 nips-2002-A Model for Learning Variance Components of Natural Images
Author: Yan Karklin, Michael S. Lewicki
Abstract: We present a hierarchical Bayesian model for learning efficient codes of higher-order structure in natural images. The model, a non-linear generalization of independent component analysis, replaces the standard assumption of independence for the joint distribution of coefficients with a distribution that is adapted to the variance structure of the coefficients of an efficient image basis. This offers a novel description of higherorder image structure and provides a way to learn coarse-coded, sparsedistributed representations of abstract image properties such as object location, scale, and texture.
4 0.16524257 202 nips-2002-Unsupervised Color Constancy
Author: Kinh Tieu, Erik G. Miller
Abstract: In [1] we introduced a linear statistical model of joint color changes in images due to variation in lighting and certain non-geometric camera parameters. We did this by measuring the mappings of colors in one image of a scene to colors in another image of the same scene under different lighting conditions. Here we increase the flexibility of this color flow model by allowing flow coefficients to vary according to a low order polynomial over the image. This allows us to better fit smoothly varying lighting conditions as well as curved surfaces without endowing our model with too much capacity. We show results on image matching and shadow removal and detection.
5 0.15442429 173 nips-2002-Recovering Intrinsic Images from a Single Image
Author: Marshall F. Tappen, William T. Freeman, Edward H. Adelson
Abstract: We present an algorithm that uses multiple cues to recover shading and reflectance intrinsic images from a single image. Using both color information and a classifier trained to recognize gray-scale patterns, each image derivative is classified as being caused by shading or a change in the surface’s reflectance. Generalized Belief Propagation is then used to propagate information from areas where the correct classification is clear to areas where it is ambiguous. We also show results on real images.
6 0.15347065 87 nips-2002-Fast Transformation-Invariant Factor Analysis
7 0.13804743 182 nips-2002-Shape Recipes: Scene Representations that Refer to the Image
8 0.12886171 126 nips-2002-Learning Sparse Multiscale Image Representations
9 0.12632796 132 nips-2002-Learning to Detect Natural Image Boundaries Using Brightness and Texture
10 0.11518933 2 nips-2002-A Bilinear Model for Sparse Coding
11 0.11317375 122 nips-2002-Learning About Multiple Objects in Images: Factorial Learning without Factorial Search
12 0.10903202 133 nips-2002-Learning to Perceive Transparency from the Statistics of Natural Scenes
13 0.10045505 57 nips-2002-Concurrent Object Recognition and Segmentation by Graph Partitioning
14 0.096225083 127 nips-2002-Learning Sparse Topographic Representations with Products of Student-t Distributions
15 0.091145374 136 nips-2002-Linear Combinations of Optic Flow Vectors for Estimating Self-Motion - a Real-World Test of a Neural Model
16 0.090690196 206 nips-2002-Visual Development Aids the Acquisition of Motion Velocity Sensitivities
17 0.078143485 118 nips-2002-Kernel-Based Extraction of Slow Features: Complex Cells Learn Disparity and Translation Invariance from Natural Images
18 0.077146798 193 nips-2002-Temporal Coherence, Natural Image Sequences, and the Visual Cortex
19 0.071192451 137 nips-2002-Location Estimation with a Differential Update Network
20 0.067246906 116 nips-2002-Interpreting Neural Response Variability as Monte Carlo Sampling of the Posterior
topicId topicWeight
[(0, -0.193), (1, 0.013), (2, -0.028), (3, 0.424), (4, -0.03), (5, -0.02), (6, 0.217), (7, 0.015), (8, 0.009), (9, -0.048), (10, 0.072), (11, -0.091), (12, 0.069), (13, -0.01), (14, 0.043), (15, -0.055), (16, 0.065), (17, 0.081), (18, 0.073), (19, 0.151), (20, -0.076), (21, 0.044), (22, -0.119), (23, 0.094), (24, 0.124), (25, 0.266), (26, 0.031), (27, -0.068), (28, -0.115), (29, 0.083), (30, 0.114), (31, -0.085), (32, 0.067), (33, -0.051), (34, 0.162), (35, 0.152), (36, -0.065), (37, -0.109), (38, 0.089), (39, 0.047), (40, -0.006), (41, 0.061), (42, 0.116), (43, 0.042), (44, 0.129), (45, 0.005), (46, -0.025), (47, 0.038), (48, 0.02), (49, 0.033)]
simIndex simValue paperId paperTitle
same-paper 1 0.98492706 39 nips-2002-Bayesian Image Super-Resolution
Author: Michael E. Tipping, Christopher M. Bishop
Abstract: The extraction of a single high-quality image from a set of lowresolution images is an important problem which arises in fields such as remote sensing, surveillance, medical imaging and the extraction of still images from video. Typical approaches are based on the use of cross-correlation to register the images followed by the inversion of the transformation from the unknown high resolution image to the observed low resolution images, using regularization to resolve the ill-posed nature of the inversion process. In this paper we develop a Bayesian treatment of the super-resolution problem in which the likelihood function for the image registration parameters is based on a marginalization over the unknown high-resolution image. This approach allows us to estimate the unknown point spread function, and is rendered tractable through the introduction of a Gaussian process prior over images. Results indicate a significant improvement over techniques based on MAP (maximum a-posteriori) point optimization of the high resolution image and associated registration parameters. 1
2 0.93275887 74 nips-2002-Dynamic Structure Super-Resolution
Author: Amos J. Storkey
Abstract: The problem of super-resolution involves generating feasible higher resolution images, which are pleasing to the eye and realistic, from a given low resolution image. This might be attempted by using simple filters for smoothing out the high resolution blocks or through applications where substantial prior information is used to imply the textures and shapes which will occur in the images. In this paper we describe an approach which lies between the two extremes. It is a generic unsupervised method which is usable in all domains, but goes beyond simple smoothing methods in what it achieves. We use a dynamic tree-like architecture to model the high resolution data. Approximate conditioning on the low resolution image is achieved through a mean field approach. 1
3 0.62250125 182 nips-2002-Shape Recipes: Scene Representations that Refer to the Image
Author: William T. Freeman, Antonio Torralba
Abstract: The goal of low-level vision is to estimate an underlying scene, given an observed image. Real-world scenes (eg, albedos or shapes) can be very complex, conventionally requiring high dimensional representations which are hard to estimate and store. We propose a low-dimensional representation, called a scene recipe, that relies on the image itself to describe the complex scene configurations. Shape recipes are an example: these are the regression coefficients that predict the bandpassed shape from image data. We describe the benefits of this representation, and show two uses illustrating their properties: (1) we improve stereo shape estimates by learning shape recipes at low resolution and applying them at full resolution; (2) Shape recipes implicitly contain information about lighting and materials and we use them for material segmentation.
4 0.55718499 87 nips-2002-Fast Transformation-Invariant Factor Analysis
Author: Anitha Kannan, Nebojsa Jojic, Brendan J. Frey
Abstract: Dimensionality reduction techniques such as principal component analysis and factor analysis are used to discover a linear mapping between high dimensional data samples and points in a lower dimensional subspace. In [6], Jojic and Frey introduced mixture of transformation-invariant component analyzers (MTCA) that can account for global transformations such as translations and rotations, perform clustering and learn local appearance deformations by dimensionality reduction. However, due to enormous computational requirements of the EM algorithm for learning the model, O( ) where is the dimensionality of a data sample, MTCA was not practical for most applications. In this paper, we demonstrate how fast Fourier transforms can reduce the computation to the order of log . With this speedup, we show the effectiveness of MTCA in various applications - tracking, video textures, clustering video sequences, object recognition, and object detection in images. ¡ ¤ ¤ ¤ ¤
5 0.52580172 133 nips-2002-Learning to Perceive Transparency from the Statistics of Natural Scenes
Author: Anat Levin, Assaf Zomet, Yair Weiss
Abstract: Certain simple images are known to trigger a percept of transparency: the input image I is perceived as the sum of two images I(x, y) = I1 (x, y) + I2 (x, y). This percept is puzzling. First, why do we choose the “more complicated” description with two images rather than the “simpler” explanation I(x, y) = I1 (x, y) + 0 ? Second, given the infinite number of ways to express I as a sum of two images, how do we compute the “best” decomposition ? Here we suggest that transparency is the rational percept of a system that is adapted to the statistics of natural scenes. We present a probabilistic model of images based on the qualitative statistics of derivative filters and “corner detectors” in natural scenes and use this model to find the most probable decomposition of a novel image. The optimization is performed using loopy belief propagation. We show that our model computes perceptually “correct” decompositions on synthetic images and discuss its application to real images. 1
6 0.52341866 173 nips-2002-Recovering Intrinsic Images from a Single Image
7 0.44060805 150 nips-2002-Multiple Cause Vector Quantization
8 0.43789533 202 nips-2002-Unsupervised Color Constancy
9 0.43065235 132 nips-2002-Learning to Detect Natural Image Boundaries Using Brightness and Texture
10 0.42749143 126 nips-2002-Learning Sparse Multiscale Image Representations
11 0.39555252 57 nips-2002-Concurrent Object Recognition and Segmentation by Graph Partitioning
12 0.39161289 10 nips-2002-A Model for Learning Variance Components of Natural Images
13 0.38202927 122 nips-2002-Learning About Multiple Objects in Images: Factorial Learning without Factorial Search
14 0.33782929 127 nips-2002-Learning Sparse Topographic Representations with Products of Student-t Distributions
15 0.33728153 16 nips-2002-A Prototype for Automatic Recognition of Spontaneous Facial Actions
16 0.31206456 2 nips-2002-A Bilinear Model for Sparse Coding
17 0.28261146 136 nips-2002-Linear Combinations of Optic Flow Vectors for Estimating Self-Motion - a Real-World Test of a Neural Model
18 0.24309003 193 nips-2002-Temporal Coherence, Natural Image Sequences, and the Visual Cortex
19 0.23859142 177 nips-2002-Retinal Processing Emulation in a Programmable 2-Layer Analog Array Processor CMOS Chip
20 0.23710012 206 nips-2002-Visual Development Aids the Acquisition of Motion Velocity Sensitivities
topicId topicWeight
[(1, 0.02), (3, 0.011), (11, 0.024), (23, 0.018), (31, 0.249), (42, 0.064), (54, 0.134), (55, 0.048), (57, 0.016), (67, 0.015), (68, 0.027), (74, 0.159), (92, 0.026), (98, 0.097)]
simIndex simValue paperId paperTitle
1 0.89203817 131 nips-2002-Learning to Classify Galaxy Shapes Using the EM Algorithm
Author: Sergey Kirshner, Igor V. Cadez, Padhraic Smyth, Chandrika Kamath
Abstract: We describe the application of probabilistic model-based learning to the problem of automatically identifying classes of galaxies, based on both morphological and pixel intensity characteristics. The EM algorithm can be used to learn how to spatially orient a set of galaxies so that they are geometrically aligned. We augment this “ordering-model” with a mixture model on objects, and demonstrate how classes of galaxies can be learned in an unsupervised manner using a two-level EM algorithm. The resulting models provide highly accurate classi£cation of galaxies in cross-validation experiments. 1 Introduction and Background The £eld of astronomy is increasingly data-driven as new observing instruments permit the rapid collection of massive archives of sky image data. In this paper we investigate the problem of identifying bent-double radio galaxies in the FIRST (Faint Images of the Radio Sky at Twenty-cm) Survey data set [1]. FIRST produces large numbers of radio images of the deep sky using the Very Large Array at the National Radio Astronomy Observatory. It is scheduled to cover more that 10,000 square degrees of the northern and southern caps (skies). Of particular scienti£c interest to astronomers is the identi£cation and cataloging of sky objects with a “bent-double” morphology, indicating clusters of galaxies ([8], see Figure 1). Due to the very large number of observed deep-sky radio sources, (on the order of 106 so far) it is infeasible for the astronomers to label all of them manually. The data from the FIRST Survey (http://sundog.stsci.edu/) is available in both raw image format and in the form of a catalog of features that have been automatically derived from the raw images by an image analysis program [8]. Each entry corresponds to a single detectable “blob” of bright intensity relative to the sky background: these entries are called Figure 1: 4 examples of radio-source galaxy images. The two on the left are labelled as “bent-doubles” and the two on the right are not. The con£gurations on the left have more “bend” and symmetry than the two non-bent-doubles on the right. components. The “blob” of intensities for each component is £tted with an ellipse. The ellipses and intensities for each component are described by a set of estimated features such as sky position of the centers (RA (right ascension) and Dec (declination)), peak density ¤ux and integrated ¤ux, root mean square noise in pixel intensities, lengths of the major and minor axes, and the position angle of the major axis of the ellipse counterclockwise from the north. The goal is to £nd sets of components that are spatially close and that resemble a bent-double. In the results in this paper we focus on candidate sets of components that have been detected by an existing spatial clustering algorithm [3] where each set consists of three components from the catalog (three ellipses). As of the year 2000, the catalog contained over 15,000 three-component con£gurations and over 600,000 con£gurations total. The set which we use to build and evaluate our models consists of a total of 128 examples of bent-double galaxies and 22 examples of non-bent-double con£gurations. A con£guration is labelled as a bent-double if two out of three astronomers agree to label it as such. Note that the visual identi£cation process is the bottleneck in the process since it requires signi£cant time and effort from the scientists, and is subjective and error-prone, motivating the creation of automated methods for identifying bent-doubles. 
Three-component bent-double con£gurations typically consist of a center or “core” component and two other side components called “lobes”. Previous work on automated classi£cation of three-component candidate sets has focused on the use of decision-tree classi£ers using a variety of geometric and image intensity features [3]. One of the limitations of the decision-tree approach is its relative in¤exibility in handling uncertainty about the object being classi£ed, e.g., the identi£cation of which of the three components should be treated as the core of a candidate object. A bigger limitation is the £xed size of the feature vector. A primary motivation for the development of a probabilistic approach is to provide a framework that can handle uncertainties in a ¤exible coherent manner. 2 Learning to Match Orderings using the EM Algorithm We denote a three-component con£guration by C = (c 1 , c2 , c3 ), where the ci ’s are the components (or “blobs”) described in the previous section. Each component c x is represented as a feature vector, where the speci£c features will be de£ned later. Our approach focuses on building a probabilistic model for bent-doubles: p (C) = p (c1 , c2 , c3 ), the likelihood of the observed ci under a bent-double model where we implicitly condition (for now) on the class “bent-double.” By looking at examples of bent-double galaxies and by talking to the scientists studying them, we have been able to establish a number of potentially useful characteristics of the components, the primary one being geometric symmetry. In bent-doubles, two of the components will look close to being mirror images of one another with respect to a line through the third component. We will call mirror-image components lobe compo- core core 1 lobe 2 2 3 lobe 2 lobe 1 lobe 1 component 2 lobe 1 component 3 lobe 2 lobe 1 4 core lobe 1 5 lobe 2 lobe 2 6 core core component 1 core lobe 2 lobe 1 Figure 2: Possible orderings for a hypothetical bent-double. A good choice of ordering would be either 1 or 2. nents, and the other one the core component. It also appears that non-bent-doubles either don’t exhibit such symmetry, or the angle formed at the core component is too straight— the con£guration is not “bent” enough. Once the core component is identi£ed, we can calculate symmetry-based features. However, identifying the most plausible core component requires either an additional algorithm or human expertise. In our approach we use a probabilistic framework that averages over different possible orderings weighted by their probability given the data. In order to de£ne the features, we £rst need to determine the mapping of the components to labels “core”, “lobe 1”, and “lobe 2” (c, l1 , and l2 for short). We will call such a mapping an ordering. Figure 2 shows an example of possible orderings for a con£guration. We can number the orderings 1, . . . , 6. We can then write 6 p (C) = p (cc , cl1 , cl2 |Ω = k) p (Ω = k) , (1) k=1 i.e., a mixture over all possible orientations. Each ordering is assumed a priori to be equally 1 likely, i.e., p(Ω = k) = 6 . Intuitively, for a con£guration that clearly looks like a bentdouble the terms in the mixture corresponding to the correct ordering would dominate, while the other orderings would have much lower probability. We represent each component cx by M features (we used M = 3). Note that the features can only be calculated conditioned on a particular mapping since they rely on properties of the (assumed) core and lobe components. 
We denote by fmk (C) the values corresponding to the mth feature for con£guration C under the ordering Ω = k, and by f mkj (C) we denote the feature value of component j: fmk (C) = (fmk1 (C) , . . . , fmkBm (C)) (in our case, Bm = 3 is the number of components). Conditioned on a particular mapping Ω = k, where x ∈ {c, l1 , l2 } and c,l1 ,l2 are de£ned in a cyclical order, our features are de£ned as: • f1k (C) : Log-transformed angle, the angle formed at the center of the component (a vertex of the con£guration) mapped to label x; |center of x to center of next(x)| • f2k (C) : Logarithms of side ratios, center of x to center of prev(x) ; | | peak ¤ux of next(x) • f3k (C) : Logarithms of intensity ratios, peak ¤ux of prev(x) , and so (C|Ω = k) = (f1k (C) , f2k (C) f3k (C)) for a 9-dimensional feature vector in total. Other features are of course also possible. For our purposes in this paper this particular set appears to capture the more obvious visual properties of bent-double galaxies. For a set D = {d1 , . . . , dN } of con£gurations, under an i.i.d. assumption for con£gurations, we can write the likelihood as N K P (D) = P (Ωi = k) P (f1k (di ) , . . . , fM k (di )) , i=1 k=1 where Ωi is the ordering for con£guration d i . While in the general case one can model P (f1k (di ) , . . . , fM k (di )) as a full joint distribution, for the results reported in this paper we make a number of simplifying assumptions, motivated by the fact that we have relatively little labelled training data available for model building. First, we assume that the fmk (di ) are conditionally independent. Second, we are also able to reduce the number of components for each fmk (di ) by noting functional dependencies. For example, given two angles of a triangle, we can uniquely determine the third one. We also assume that the remaining components for each feature are conditionally independent. Under these assumptions the multivariate joint distribution P (f1k (di ) , . . . , fM k (di )) is factored into a product of simple distributions, which (for the purposes of this paper) we model using Gaussians. If we know for every training example which component should be mapped to label c, we can then unambiguously estimate the parameters for each of these distributions. In practice, however, the identity of the core component is unknown for each object. Thus, we use the EM algorithm to automatically estimate the parameters of the above model. We begin by randomly assigning an ordering to each object. For each subsequent iteration the E-step consists of estimating a probability distribution over possible orderings for each object, and the M-step estimates the parameters of the feature-distributions using the probabilistic ordering information from the E-step. In practice we have found that the algorithm converges relatively quickly (in 20 to 30 iterations) on both simulated and real data. It is somewhat surprising that this algorithm can reliably “learn” how to align a set of objects, without using any explicit objective function for alignment, but instead based on the fact that feature values for certain orderings exhibit a certain self-consistency relative to the model. Intuitively it is this self-consistency that leads to higher-likelihood solutions and that allows EM to effectively align the objects by maximizing the likelihood. 
After the model has been estimated, the likelihood of new objects can also be calculated under the model, where the likelihood now averages over all possible orderings weighted by their probability given the observed features. The problem described above is a speci£c instance of a more general feature unscrambling problem. In our case, we assume that con£gurations of three 3-dimensional components (i.e. 3 features) each are generated by some distribution. Once the objects are generated, the orders of their components are permuted or scrambled. The task is then to simultaneously learn the parameters of the original distributions and the scrambling for each object. In the more general form, each con£guration consists of L M -dimensional con£gurations. Since there are L! possible orderings of L components, the problem becomes computationally intractable if L is large. One solution is to restrict the types of possible scrambles (to cyclic shifts for example). 3 Automatic Galaxy Classi£cation We used the algorithm described in the previous section to estimate the parameters of features and orderings of the bent-double class from labelled training data and then to rank candidate objects according to their likelihood under the model. We used leave-one-out cross-validation to test the classi£cation ability of this supervised model, where for each of the 150 examples we build a model using the positive examples from the set of 149 “other” examples, and then score the “left-out” example with this model. The examples are then sorted in decreasing order by their likelihood score (averaging over different possi- 1 0.9 True positive rate 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 False positive rate Figure 3: ROC plot for a model using angle, ratio of sides, and ratio of intensities, as features, and learned using ordering-EM with labelled data. ble orderings) and the results are analyzed using a receiver operating characteristic (ROC) methodology. We use AROC , the area under the curve, as a measure of goodness of the model, where a perfect model would have AROC = 1 and random performance corresponds to AROC = 0.5. The supervised model, using EM for learning ordering models, has a cross-validated AROC score of 0.9336 (Figure 3) and appears to be quite useful at detecting bent-double galaxies. 4 Model-Based Galaxy Clustering A useful technique in understanding astronomical image data is to cluster image objects based on their morphological and intensity properties. For example, consider how one might cluster the image objects in Figure 1 into clusters, where we have features on angles, intensities, and so forth. Just as with classi£cation, clustering of the objects is impeded by not knowing which of the “blobs” corresponds to the true “core” component. From a probabilistic viewpoint, clustering can be treated as introducing another level of hidden variables, namely the unknown class (or cluster) identity of each object. We can generalize the EM algorithm for orderings (Section 2) to handle this additional hidden level. The model is now a mixture of clusters where each cluster is modelled as a mixture of orderings. This leads to a more complex two-level EM algorithm than that presented in Section 2, where at the inner-level the algorithm is learning how to orient the objects, and at the outer level the algorithm is learning how to group the objects into C classes. 
Space does not permit a detailed presentation of this algorithm—however, the derivation is straightforward and produces intuitive update rules such as: µcmj = ˆ 1 ˆ N P (cl = c|Θ) N K P (cli = c|Ωi = k, D, Θ) P (Ωi = k|D, Θ) fmkj (di ) i=1 k=1 where µcmj is the mean for the cth cluster (1 ≤ c ≤ C), the mth feature (1 ≤ m ≤ M ), and the jth component of fmk (di ), and Ωi = k corresponds to ordering k for the ith object. We applied this algorithm to the data set of 150 sky objects, where unlike the results in Section 3, the algorithm now had no access to the class labels. We used the Gaussian conditional-independence model as before, and grouped the data into K = 2 clusters. Figures 4 and 5 show the highest likelihood objects, out of 150 total objects, under the Bent−double Bent−double Bent−double Bent−double Bent−double Bent−double Bent−double Bent−double Figure 4: The 8 objects with the highest likelihood conditioned on the model for the larger of the two clusters learned by the unsupervised algorithm. Bent−double Non−bent−double Non−bent−double Non−bent−double Non−bent−double Non−bent−double Bent−double Non−bent−double Figure 5: The 8 objects with the highest likelihood conditioned on the model for the smaller of the two clusters learned by the unsupervised algorithm. 150 Unsupervised Rank bent−doubles non−bent−doubles 100 50 0 0 50 100 150 Supervised Rank Figure 6: A scatter plot of the ranking from the unsupervised model versus that of the supervised model. models for the larger cluster and smaller cluster respectively. The larger cluster is clearly a bent-double cluster: 89 of the 150 objects are more likely to belong to this cluster under the model and 88 out of the 89 objects in this cluster have the bent-double label. In other words, the unsupervised algorithm has discovered a cluster that corresponds to “strong examples” of bent-doubles relative to the particular feature-space and model. In fact the non-bentdouble that is assigned to this group may well have been mislabelled (image not shown here). The objects in Figure 5 are clearly inconsistent with the general visual pattern of bent-doubles and this cluster consists of a mixture of non-bent-double and “weaker” bentdouble galaxies. The objects in Figures 5 that are labelled as bent-doubles seem quite atypical compared to the bent-doubles in Figure 4. A natural hypothesis is that cluster 1 (88 bent-doubles) in the unsupervised model is in fact very similar to the supervised model learned using the labelled set of 128 bent-doubles in Section 3. Indeed the parameters of the two Gaussian models agree quite closely and the similarity of the two models is illustrated clearly in Figure 6 where we plot the likelihoodbased ranks of the unsupervised model versus those of the supervised model. Both models are in close agreement and both are clearly performing well in terms of separating the objects in terms of their class labels. 5 Related Work and Future Directions A related earlier paper is Kirshner et al [6] where we presented a heuristic algorithm for solving the orientation problem for galaxies. The generalization to an EM framework in this paper is new, as is the two-level EM algorithm for clustering objects in an unsupervised manner. There is a substantial body of work in computer vision on solving a variety of different object matching problems using probabilistic techniques—see Mjolsness [7] for early ideas and Chui et al. [2] for a recent application in medical imaging. Our work here differs in a number of respects. 
One important difference is that we use EM to learn a model for the simultaneous correspondence of N objects, using both geometric and intensity-based features, whereas prior work in vision has primarily focused on matching one object to another (essentially the N = 2 case). An exception is the recent work of Frey and Jojic [4, 5] who used a similar EM-based approach to simultaneously cluster images and estimate a variety of local spatial deformations. The work described in this paper can be viewed as an extension and application of this general methodology to a real-world problem in galaxy classi£cation. Earlier work on bent-double galaxy classi£cation used decision tree classi£ers based on a variety of geometric and intensity-based features [3]. In future work we plan to compare the performance of this decision tree approach with the probabilistic model-based approach proposed in this paper. The model-based approach has some inherent advantages over a decision-tree model for these types of problems. For example, it can directly handle objects in the catalog with only 2 blobs or with 4 or more blobs by integrating over missing intensities and over missing correspondence information using mixture models that allow for missing or extra “blobs”. Being able to classify such con£gurations automatically is of signi£cant interest to the astronomers. Acknowledgments This work was performed under a sub-contract from the ASCI Scienti£c Data Management Project of the Lawrence Livermore National Laboratory. The work of S. Kirshner and P. Smyth was also supported by research grants from NSF (award IRI-9703120), the Jet Propulsion Laboratory, IBM Research, and Microsoft Research. I. Cadez was supported by a Microsoft Graduate Fellowship. The work of C. Kamath was performed under the auspices of the U.S. Department of Energy by University of California Lawrence Livermore National Laboratory under contract No. W-7405-Eng-48. We gratefully acknowledge our FIRST collaborators, in particular, Robert H. Becker for sharing his expertise on the subject. References [1] R. H. Becker, R. L. White, and D. J. Helfand. The FIRST Survey: Faint Images of the Radio Sky at Twenty-cm. Astrophysical Journal, 450:559, 1995. [2] H. Chui, L. Win, R. Schultz, J. S. Duncan, and A. Rangarajan. A uni£ed feature registration method for brain mapping. In Proceedings of Information Processing in Medical Imaging, pages 300–314. Springer-Verlag, 2001. [3] I. K. Fodor, E. Cant´ -Paz, C. Kamath, and N. A. Tang. Finding bent-double radio u galaxies: A case study in data mining. In Proceedings of the Interface: Computer Science and Statistics Symposium, volume 33, 2000. [4] B. J. Frey and N. Jojic. Estimating mixture models of images and inferring spatial transformations using the EM algorithm. In Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1999. [5] N. Jojic and B. J. Frey. Topographic transformation as a discrete latent variable. In Advances in Neural Information Processing Systems 12. MIT Press, 2000. ´ [6] S. Kirshner, I. V. Cadez, P. Smyth, C. Kamath, and E. Cantu-Paz. Probabilistic modelbased detection of bent-double radio galaxies. In Proceedings 16th International Conference on Pattern Recognition, volume 2, pages 499–502, 2002. [7] E. Mjolsness. Bayesian inference on visual grammars by neural networks that optimize. Technical Report YALEU/DCS/TR-854, Department of Computer Science, Yale University, May 1991. [8] R. L. White, R. H. Becker, D. J. Helfand, and M. D. Gregg. 
A catalog of 1.4 GHz radio sources from the FIRST Survey. Astrophysical Journal, 475:479, 1997.
same-paper 2 0.8538298 39 nips-2002-Bayesian Image Super-Resolution
Author: Michael E. Tipping, Christopher M. Bishop
Abstract: The extraction of a single high-quality image from a set of lowresolution images is an important problem which arises in fields such as remote sensing, surveillance, medical imaging and the extraction of still images from video. Typical approaches are based on the use of cross-correlation to register the images followed by the inversion of the transformation from the unknown high resolution image to the observed low resolution images, using regularization to resolve the ill-posed nature of the inversion process. In this paper we develop a Bayesian treatment of the super-resolution problem in which the likelihood function for the image registration parameters is based on a marginalization over the unknown high-resolution image. This approach allows us to estimate the unknown point spread function, and is rendered tractable through the introduction of a Gaussian process prior over images. Results indicate a significant improvement over techniques based on MAP (maximum a-posteriori) point optimization of the high resolution image and associated registration parameters. 1
3 0.67652619 132 nips-2002-Learning to Detect Natural Image Boundaries Using Brightness and Texture
Author: David R. Martin, Charless C. Fowlkes, Jitendra Malik
Abstract: The goal of this work is to accurately detect and localize boundaries in natural scenes using local image measurements. We formulate features that respond to characteristic changes in brightness and texture associated with natural boundaries. In order to combine the information from these features in an optimal way, a classifier is trained using human labeled images as ground truth. We present precision-recall curves showing that the resulting detector outperforms existing approaches.
4 0.67609417 175 nips-2002-Reinforcement Learning to Play an Optimal Nash Equilibrium in Team Markov Games
Author: Xiaofeng Wang, Tuomas Sandholm
Abstract: Multiagent learning is a key problem in AI. In the presence of multiple Nash equilibria, even agents with non-conflicting interests may not be able to learn an optimal coordination policy. The problem is exaccerbated if the agents do not know the game and independently receive noisy payoffs. So, multiagent reinforfcement learning involves two interrelated problems: identifying the game and learning to play. In this paper, we present optimal adaptive learning, the first algorithm that converges to an optimal Nash equilibrium with probability 1 in any team Markov game. We provide a convergence proof, and show that the algorithm’s parameters are easy to set to meet the convergence conditions.
5 0.67584801 2 nips-2002-A Bilinear Model for Sparse Coding
Author: David B. Grimes, Rajesh P. Rao
Abstract: Recent algorithms for sparse coding and independent component analysis (ICA) have demonstrated how localized features can be learned from natural images. However, these approaches do not take image transformations into account. As a result, they produce image codes that are redundant because the same feature is learned at multiple locations. We describe an algorithm for sparse coding based on a bilinear generative model of images. By explicitly modeling the interaction between image features and their transformations, the bilinear approach helps reduce redundancy in the image code and provides a basis for transformationinvariant vision. We present results demonstrating bilinear sparse coding of natural images. We also explore an extension of the model that can capture spatial relationships between the independent features of an object, thereby providing a new framework for parts-based object recognition.
6 0.67572021 162 nips-2002-Parametric Mixture Models for Multi-Labeled Text
7 0.67298865 122 nips-2002-Learning About Multiple Objects in Images: Factorial Learning without Factorial Search
8 0.67219102 89 nips-2002-Feature Selection by Maximum Marginal Diversity
9 0.67084551 52 nips-2002-Cluster Kernels for Semi-Supervised Learning
10 0.67071748 74 nips-2002-Dynamic Structure Super-Resolution
11 0.66962636 152 nips-2002-Nash Propagation for Loopy Graphical Games
12 0.66908878 124 nips-2002-Learning Graphical Models with Mercer Kernels
13 0.66503441 135 nips-2002-Learning with Multiple Labels
14 0.66362828 173 nips-2002-Recovering Intrinsic Images from a Single Image
15 0.6627636 53 nips-2002-Clustering with the Fisher Score
16 0.66145664 48 nips-2002-Categorization Under Complexity: A Unified MDL Account of Human Learning of Regular and Irregular Categories
17 0.65947849 10 nips-2002-A Model for Learning Variance Components of Natural Images
18 0.65734482 127 nips-2002-Learning Sparse Topographic Representations with Products of Student-t Distributions
19 0.65722346 163 nips-2002-Prediction and Semantic Association
20 0.65668654 3 nips-2002-A Convergent Form of Approximate Policy Iteration