
74 nips-2002-Dynamic Structure Super-Resolution


Source: pdf

Author: Amos J. Storkey

Abstract: The problem of super-resolution involves generating feasible higher resolution images, which are pleasing to the eye and realistic, from a given low resolution image. This might be attempted by using simple filters for smoothing out the high resolution blocks or through applications where substantial prior information is used to imply the textures and shapes which will occur in the images. In this paper we describe an approach which lies between the two extremes. It is a generic unsupervised method which is usable in all domains, but goes beyond simple smoothing methods in what it achieves. We use a dynamic tree-like architecture to model the high resolution data. Approximate conditioning on the low resolution image is achieved through a mean field approach.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Abstract: The problem of super-resolution involves generating feasible higher resolution images, which are pleasing to the eye and realistic, from a given low resolution image. [sent-4, score-1.141]

2 This might be attempted by using simple filters for smoothing out the high resolution blocks or through applications where substantial prior information is used to imply the textures and shapes which will occur in the images. [sent-5, score-0.667]

3 We use a dynamic tree-like architecture to model the high resolution data. [sent-8, score-0.657]

4 Approximate conditioning on the low resolution image is achieved through a mean field approach. [sent-9, score-0.842]

5 1 Introduction: Good techniques for super-resolution are especially useful where physical limitations exist preventing higher resolution images from being obtained. [sent-10, score-0.559]

6 For example, in astronomy, where public presentation of images is of significant importance, super-resolution techniques have been suggested. [sent-11, score-0.105]

7 Whenever dynamic image enlargement is needed, such as on some web pages, super-resolution techniques can be utilised. [sent-12, score-0.302]

8 This paper focuses on the issue of how to increase the resolution of a single image using only prior information about images in general, and not relying on a specific training set or the use of multiple images. [sent-13, score-0.795]

9 Approaches range from simple use of Gaussian or, preferably, median filtering, to supervised learning methods based on learning image patches corresponding to low resolution regions from training data, and effectively sewing these patches together in a consistent manner. [sent-15, score-1.103]

10 There is a demand for methods which are reasonably fast, which are generic in that they do not rely on having suitable training data, but which do better than standard linear filters or interpolation methods. [sent-17, score-0.154]

11 This paper describes an approach to resolution doubling which achieves this. [sent-18, score-0.545]

12 The method is structurally related to one layer of the dynamic tree model [9, 8, 1], except that it uses real-valued variables. [sent-19, score-0.189]

13 2 Related work: Simple approaches to resolution enhancement have been around for some time. [sent-20, score-0.549]

14 Gaussian and Wiener filters (and a host of other linear filters) have been used for smoothing the blockiness created by the low resolution image. [sent-21, score-0.74]

15 Interpolation methods such as cubic-spline interpolation tend to be the most common image enhancement approach. [sent-23, score-0.382]

16 Many authors are interested in reconstruction based on multiple slightly perturbed subsamples from an image [3, 2]. [sent-25, score-0.257]

17 Freeman et al. follow a supervised approach, learning a low to high resolution patch model (or rather storing examples of such maps), and utilising a Markov random field for combining them and loopy propagation for inference. [sent-31, score-0.714]

18 There are two primary difficulties with smoothing (e.g. Gaussian, Wiener, median filters) or interpolation (bicubic, cubic spline) methods. [sent-34, score-0.203]

19 The smoothing occurs both within the gradual change in colour of the sky, say, and across the horizon, producing blurring problems. [sent-36, score-0.231]

20 Second, these approaches are inconsistent: subsampling the super-resolution image will not return the original low-resolution one. [sent-37, score-0.207]

21 Hence we need a model which maintains consistency but also tries to ensure that smoothing does not occur across region boundaries (except as much as is needed for anti-aliasing). [sent-38, score-0.106]

22 3 The model: Here the high-resolution image is described by a series of very small patches with varying shapes. [sent-39, score-0.3]

23 Pixel values within these patches can vary, but will have a common mean value. [sent-40, score-0.123]

24 A priori, exactly where these patches should be is uncertain, and so the pixel to patch mapping is allowed to be a dynamic one. [sent-42, score-0.577]

25 The lowest layer consists of the visible low-resolution pixels. [sent-45, score-0.11]

26 The intermediate layer is a high-resolution image (4 × 4 the size of the low-resolution image). [sent-46, score-0.268]

27 The top layer is a latent layer which is a little more than 2 × 2 the size of the low resolution image. [sent-47, score-0.938]

28 The latent variables are ‘positioned’ at the corners, centres and edge centres of the pixels of the low resolution image. [sent-48, score-1.045]

29 The values of the pixel colour of the high resolution nodes are each a single sample from a Gaussian mixture (in colour space), where each mixture centre is given by the pixel colour of a particular parent latent variable. Figure 1: The three layers of the model (Latent, Hi Res, Low Res). [sent-49, score-2.352]

30 The small boxes in the left figure (64 of them) give the position of the high resolution pixels relative to the low resolution pixels (the 4 boxes with a thick outline). [sent-50, score-1.527]

31 The positions of the latent variable nodes are given by the black circles. [sent-51, score-0.3]

32 The colour of each high resolution pixel is generated from a mixture of Gaussians (right figure), each Gaussian centred at its latent parent pixel value. [sent-52, score-1.908]

33 The closer the parent is, the higher the prior probability of being generated by that mixture component. [sent-53, score-0.222]

34 The prior mixing coefficients decay with distance in image space between the high-resolution node and the corresponding latent node. [sent-55, score-0.532]

35 Another way of viewing this is that a further indicator variable can be introduced which selects which mixture is responsible for a given high-resolution node. [sent-56, score-0.202]

36 We say a high resolution node ‘chooses’ to connect to the parent that is responsible for it, with a connection probability given by the corresponding mixing coefficient. [sent-57, score-0.909]
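
As a concrete illustration of these distance-decaying connection probabilities, the following is a minimal sketch, assuming a simple Gaussian decay with distance; the paper's exact definition of the priors involves an integral over the pixel region $B_i$ (sentence 53 below), which this sketch does not reproduce, and the function name connection_priors and the decay constant are illustrative assumptions.

```python
# A sketch of distance-decaying prior connection probabilities p_ij
# between high-resolution pixels i and candidate latent parents j.
# The Gaussian decay is an illustrative assumption, not the paper's
# exact (integral-based) definition.
import numpy as np

def connection_priors(pixel_xy, parent_xy, decay=1.0):
    """pixel_xy: (n, 2) high-res pixel positions in image space.
    parent_xy: (m, 2) latent node positions (corners/centres).
    Returns an (n, m) row-stochastic matrix of priors p_ij."""
    d2 = ((pixel_xy[:, None, :] - parent_xy[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2.0 * decay ** 2))     # nearby parents weigh more
    return w / w.sum(axis=1, keepdims=True)  # normalise over parents j
```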

37 The high-resolution pixels corresponding to a visible node can be separated into two (or more) independent regions, corresponding to pixels on different sides of an edge (or edges). [sent-62, score-0.357]

38 A different latent variable is responsible for each region. [sent-63, score-0.315]

39 In other words each mixture component effectively corresponds to a small image patch which can vary in size depending on what pixels it is responsible for. [sent-64, score-0.537]

40 Let $v_j \in L$ denote a latent variable at site $j$ in the latent space $L$. [sent-65, score-0.564]

41 Let $x_i \in S$ denote the value of pixel $i$ in high resolution image space $S$, and let $y_k$ denote the value of the visible pixel $k$. [sent-66, score-1.559]

42 In other words the data is a linear transformation of the RGB colour values using a fixed matrix. [sent-71, score-0.198]

43 A connectivity (i.e. the indicator for the responsibility) is defined between the high-resolution nodes and the nodes in the latent layer. [sent-79, score-0.379]

44 Let $z_{ij}$ denote this connectivity, with $z_{ij}$ an indicator variable taking value 1 when $v_j$ is a parent of $x_i$ in the belief network. [sent-80, score-0.599]

45 Every high resolution pixel has one and only one parent in the latent layer. [sent-81, score-1.257]

46 3.1 Distributions: A uniform distribution over the range of pixel values is presumed for the latent variables. [sent-84, score-0.631]

47 The high resolution pixels are given by Gaussian distributions centred on the pixel values of the parental latent variable. [sent-85, score-1.301]

48 This Gaussian is presumed independent in each pixel component. [sent-86, score-0.42]

49 Finally the low resolution pixels are given by the average of the sixteen high resolution pixels covering the site of the low resolution pixel. [sent-87, score-2.051]

50 This pixel value can also be subject to some additional Gaussian noise if necessary (zero noise is assumed in this paper). [sent-88, score-0.341]
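
To make this generative story concrete, here is a minimal sketch under assumed array shapes; the function name sample_images and the parameter names omega (for the per-channel variance) and lam (for the observation noise) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_images(v, parent_of, H, W, omega=0.01, lam=0.0):
    """v: (m, 3) latent colours; parent_of: (H*W,) parent index for each
    high-res pixel. Returns (x, y): the high- and low-res images."""
    # high-res pixels: parent colour plus per-channel Gaussian noise
    x = v[parent_of] + rng.normal(0.0, np.sqrt(omega), (H * W, 3))
    x = x.reshape(H, W, 3)
    # each low-res pixel is the mean of the 4x4 block (d = 16) above it
    y = x.reshape(H // 4, 4, W // 4, 4, 3).mean(axis=(1, 3))
    if lam > 0.0:  # optional additive observation noise (zero in the paper)
        y = y + rng.normal(0.0, np.sqrt(lam), y.shape)
    return x, y
```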

51 It is presumed that each high resolution pixel is allowed to ‘choose’ its parent from the set of latent variables in an independent manner. [sent-89, score-1.336]

52 A pixel has a higher probability of choosing a nearby parent than a far away one. [sent-90, score-0.484]

53 The integral is over $B_i$, defined as the region in image space corresponding to pixel $x_i$. [sent-92, score-0.548]

54 The connection probabilities can be illustrated by the picture in figure 2. [sent-95, score-0.082]

55 First we have $P(X|Z,V) = \prod_{ijm} \frac{1}{(2\pi\Omega^m)^{1/2}} \exp\left(-z_{ij} \frac{(x_i^m - v_j^m)^2}{2\Omega^m}\right)$ (2). [sent-97, score-0.134]

56 where $\Omega^m$ is a variance which determines how much each pixel must be like its latent parent. [sent-98, score-0.588]

57 Here the indicator $z_{ij}$ ensures the only contribution for each $i$ comes from the parent $j$ of $i$. [sent-99, score-0.32]

58 Second, $P(Y|X) = \prod_{km} \frac{1}{(2\pi\Lambda)^{1/2}} \exp\left(-\frac{1}{2\Lambda}\left(y_k^m - \frac{1}{d}\sum_{i \in Pa(k)} x_i^m\right)^2\right)$ (3). Figure 2: An illustration of the connection probabilities from a high resolution pixel in the position of the smaller checkered square to the latent variables centred at each of the larger squares. [sent-100, score-1.439]

59 with $Pa(k)$ denoting the set of all the $d = 16$ high resolution pixels which go to make up the low resolution pixel $y_k$. [sent-102, score-1.692]

60 Λ determines the additive Gaussian noise which is in the low resolution image. [sent-104, score-0.605]

61 3.2 Inference: The belief network defined above is not tree structured (rather it is a mixture of tree structures) and so we have to resort to approximation methods for inference. [sent-108, score-0.183]

62 The posterior distribution is approximated using a factorised distribution over the latent space and over the connectivity. [sent-110, score-0.211]

63 Only in the high resolution space X do we consider joint distributions: we use a joint Gaussian for all the nodes corresponding to one low resolution pixel. [sent-111, score-1.227]

64 Here $q_{ij}$, $\mu_i^m$, $\nu_j^m$, $\Phi^m$ and $\Psi^m$ are variational parameters to be optimised. [sent-113, score-0.307]

65 This is equivalent to maximising the negative variational free energy (or variational log likelihood) $L(Q\|P) = \left\langle \log \frac{P(Z,V,X,Y)}{Q(Z,V,X)} \right\rangle_{Q(Z,V,X)}$ (6), where $Y$ is given by the low resolution image. [sent-115, score-0.921]

66 In this case we obtain $L(Q\|P) = -\langle \log Q(Z) - \log P(Z) \rangle_{Q(Z)} - \langle \log Q(V) - \log P(V) \rangle_{Q(V)} - \langle \log Q(X) \rangle_{Q(X)} + \langle \log P(X|Z,V) \rangle_{Q(X,Z,V)} + \langle \log P(Y|X) \rangle_{Q(Y,X)}$. [sent-116, score-0.406]

67 Here for simplicity we only solve for $q_{ij}$ and for the means $\mu_i^m$ and $\nu_j^m$, which turn out to be independent of the variational variance parameters. [sent-118, score-0.343]

68 We obtain $\nu_j^m = \frac{\sum_i q_{ij} x_i^m}{\sum_i q_{ij}}$ and $\mu_i^m = \rho_i^m + D_{c(i)}$, where $\rho_i^m = \sum_j q_{ij} \nu_j^m$ (8) and $c(i)$ is the child of $i$. [sent-119, score-0.712]

69 The update for the $q_{ij}$ is given by $q_{ij} \propto p_{ij} \exp\left(-\sum_m \frac{(x_i^m - \nu_j^m)^2}{2\Omega^m}\right)$ (10), where the constant of proportionality is given by normalisation: $\sum_j q_{ij} = 1$. [sent-123, score-0.737]
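
A minimal sketch of these mean-field updates, assuming flattened arrays and a single shared variance omega; the correction term $D_{c(i)}$ from equation (8) and the variational variances are omitted, and the function names update_v and update_q are hypothetical.

```python
import numpy as np

def update_v(q, x):
    """Latent means, cf. eq. (8): nu_j = sum_i q_ij x_i / sum_i q_ij.
    q: (n, m) responsibilities; x: (n, 3) high-res pixel means."""
    return (q.T @ x) / q.sum(axis=0)[:, None]

def update_q(p, x, v, omega):
    """Responsibilities, cf. eq. (10): q_ij ~ p_ij exp(-|x_i - v_j|^2 / 2 omega),
    normalised so that sum_j q_ij = 1. Assumes one shared variance omega."""
    d2 = ((x[:, None, :] - v[None, :, :]) ** 2).sum(-1)
    logq = np.log(p) - d2 / (2.0 * omega)
    logq -= logq.max(axis=1, keepdims=True)   # subtract max for stability
    q = np.exp(logq)
    return q / q.sum(axis=1, keepdims=True)
```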

70 For each Q(Z) optimisation (10), equations (8a) and (8b) are iterated a number of times. [sent-125, score-0.132]

71 Each optimisation loop is either done a preset number of times, or until a suitable convergence criterion is met. [sent-126, score-0.191]

72 The former approach is generally used, as the basic criterion is a limit on the time available for the optimisation to be done. [sent-127, score-0.164]

73 If Λ is not known to be zero, then it will vary from image to image, and needs to be found for each image. [sent-130, score-0.239]

74 This can be done using variational maximum likelihood, where Λ is set to maximise the variational log likelihood. [sent-131, score-0.258]

75 Σ is presumed to be independent of the images presented, and is set by hand by visualising changes on a test set. [sent-132, score-0.137]

76 Optimising Σ automatically based on the variational log likelihood is possible, but does not produce results as good, due to the complicated nature of a true prior or error-measure for images. [sent-136, score-0.187]

77 For example, a highly elaborate texture offset by one pixel will give a large mean square error, but look almost identical, whereas a blurred version of the texture would give a smaller mean square error, but look much worse. [sent-137, score-0.542]

78 5 Implementation: The basic implementation involves setting the parameters, running the mean field optimisation and then looking at the result. [sent-138, score-0.197]

79 The final result is the 4 × 4 image downsampled to 2 × 2 size: the larger image is used to get reasonable anti-aliasing. [sent-139, score-0.476]

80 To initialise the mean field optimisation, X is set equal to the bi-cubic interpolated image with added Gaussian noise. [sent-140, score-0.237]

81 Although in the examples here we used 25 Q(Z) optimisations, each of which involves 10 cycles through the mean field equations for Q(X) and Q(V), it is possible to get reasonable results with only three Q(Z) optimisation cycles, each doing 2 iterations through the mean field equations. [sent-142, score-0.299]
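
A sketch of that optimisation schedule, reusing the hypothetical update_q and update_v sketches above; the conditioning of the high-resolution means on the observed low-resolution image is passed in as an abstract callable update_x, since its exact form involves the joint Gaussian over each 4 × 4 block.

```python
def run_mean_field(x, v, p, y, omega, update_x, n_outer=25, n_inner=10):
    """Alternate Q(Z) optimisations with mean-field cycles for Q(X), Q(V).
    update_x(q, v, y) must return new high-res means given the low-res
    image y; it is left abstract here."""
    for _ in range(n_outer):          # 25 Q(Z) optimisations in the paper
        q = update_q(p, x, v, omega)  # eq. (10)
        for _ in range(n_inner):      # 10 cycles through eqs. (8a), (8b)
            v = update_v(q, x)
            x = update_x(q, v, y)
    return x
```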

82 6 Demonstrations and assessment: The method described in this paper is compared with a number of simple filtering and interpolation methods, and also with the methods of Freeman et al. [sent-146, score-0.195]

83 The image from Freeman’s website is used for comparison with that work (figure 3). [sent-147, score-0.254]

84 Full colour comparisons for these and other images can be found at http://www. [sent-148, score-0.256]

85 (a) gives the 70x70 low resolution image, (b) the true image, (c) a bi-cubic interpolation, (d) the Freeman et al. result (taken from website and downsampled), (e) dynamic structure super-resolution, (f) a median filter. [sent-159, score-1.049]

86 We compare the results of this approach with standard filtering methods using a root mean squared pixel error on a set of eight 128 by 96 colour images, giving 0. [sent-160, score-0.569]

87 0452 for the original low resolution image, bicubic interpolation, the median filter and dynamic structure super-resolution respectively. [sent-164, score-0.864]
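
For reference, the root mean squared pixel error used in this comparison can be computed as in the following small sketch.

```python
import numpy as np

def rmse(reconstruction, truth):
    """Root mean squared pixel error between two images of equal shape."""
    return float(np.sqrt(np.mean((reconstruction - truth) ** 2)))
```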

88 Unfortunately the unavailability of code prevents representative calculations for the Freeman et al. approach. [sent-165, score-0.137]

89 Dynamic structure super-resolution requires approximately 30-60 flops per 2 × 2 high resolution pixel per optimisation cycle, compared with, say, 16 flops for a linear filter, so it is more costly. [sent-166, score-1.536]

90 Qualitatively, the results for dynamic structure super-resolution are significantly better than most standard filtering approaches. [sent-169, score-0.095]

91 The texture is better represented because it maintains consistency, and the edges are sharper, although there is still some significant difference from the true image. [sent-170, score-0.102]

92 The method of Freeman et al. is perhaps comparable at this resolution, although it should be noted that their result has been downsampled here to half the size of their enhanced image. [sent-171, score-0.132]

93 Their method can produce 4 × 4 the resolution of the original, and so this does not accurately represent the full power of their technique. [sent-172, score-0.501]

94 Furthermore this image is representative of early results from their work. [sent-173, score-0.245]

95 However their approach does require learning large numbers of patches from a training set. [sent-174, score-0.093]

96 Fundamentally the dynamic structure super-resolution approach does a good job at resolution doubling without the need for representative training data. [sent-175, score-0.678]

97 The edges are not blurred and much of the blockiness is removed. [sent-176, score-0.149]

98 Dynamic structure super-resolution provides a technique for resolution enhancement, and provides an interesting starting model which is different from the Markov random field approaches. [sent-177, score-0.501]

99 A Bayesian approach to image expansion for improved definition. [sent-224, score-0.207]

100 Dynamic trees: A structured variational method giving efficient propagation rules. [sent-229, score-0.13]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('resolution', 0.501), ('pixel', 0.341), ('latent', 0.211), ('qij', 0.207), ('image', 0.207), ('colour', 0.198), ('freeman', 0.147), ('parent', 0.143), ('optimisation', 0.132), ('zij', 0.129), ('interpolation', 0.127), ('pixels', 0.125), ('median', 0.105), ('low', 0.104), ('variational', 0.1), ('dynamic', 0.095), ('patches', 0.093), ('xm', 0.091), ('vj', 0.084), ('presumed', 0.079), ('dk', 0.079), ('smoothing', 0.076), ('responsible', 0.075), ('filtering', 0.071), ('filters', 0.067), ('pij', 0.066), ('edinburgh', 0.063), ('downsampled', 0.062), ('jones', 0.062), ('centred', 0.062), ('high', 0.061), ('layer', 0.061), ('nodes', 0.06), ('bicubic', 0.059), ('blockiness', 0.059), ('blurred', 0.059), ('yk', 0.059), ('node', 0.058), ('images', 0.058), ('field', 0.058), ('log', 0.058), ('di', 0.056), ('wiener', 0.054), ('flops', 0.052), ('centres', 0.052), ('exp', 0.05), ('reconstruction', 0.05), ('mixture', 0.05), ('visible', 0.049), ('gaussian', 0.048), ('indicator', 0.048), ('patch', 0.048), ('enhancement', 0.048), ('website', 0.047), ('rj', 0.047), ('effectively', 0.047), ('astronomy', 0.047), ('km', 0.044), ('responsibility', 0.044), ('doubling', 0.044), ('connection', 0.044), ('schultz', 0.042), ('forrest', 0.042), ('figure', 0.041), ('texture', 0.041), ('different', 0.039), ('representative', 0.038), ('ordered', 0.038), ('boxes', 0.038), ('picture', 0.038), ('al', 0.037), ('belief', 0.037), ('cycles', 0.036), ('corners', 0.036), ('informatics', 0.036), ('variance', 0.036), ('involves', 0.035), ('hill', 0.035), ('coefficient', 0.035), ('assessment', 0.035), ('position', 0.034), ('tree', 0.033), ('producing', 0.033), ('et', 0.033), ('ij', 0.032), ('vary', 0.032), ('criterion', 0.032), ('edges', 0.031), ('res', 0.031), ('mean', 0.03), ('maintains', 0.03), ('structured', 0.03), ('division', 0.03), ('variable', 0.029), ('calculations', 0.029), ('site', 0.029), ('prior', 0.029), ('filter', 0.028), ('suitable', 0.027), ('mixing', 0.027)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000001 74 nips-2002-Dynamic Structure Super-Resolution


2 0.47034681 39 nips-2002-Bayesian Image Super-Resolution

Author: Michael E. Tipping, Christopher M. Bishop

Abstract: The extraction of a single high-quality image from a set of low-resolution images is an important problem which arises in fields such as remote sensing, surveillance, medical imaging and the extraction of still images from video. Typical approaches are based on the use of cross-correlation to register the images followed by the inversion of the transformation from the unknown high resolution image to the observed low resolution images, using regularization to resolve the ill-posed nature of the inversion process. In this paper we develop a Bayesian treatment of the super-resolution problem in which the likelihood function for the image registration parameters is based on a marginalization over the unknown high-resolution image. This approach allows us to estimate the unknown point spread function, and is rendered tractable through the introduction of a Gaussian process prior over images. Results indicate a significant improvement over techniques based on MAP (maximum a-posteriori) point optimization of the high resolution image and associated registration parameters.

3 0.17023678 57 nips-2002-Concurrent Object Recognition and Segmentation by Graph Partitioning

Author: Stella X. Yu, Ralph Gross, Jianbo Shi

Abstract: Segmentation and recognition have long been treated as two separate processes. We propose a mechanism based on spectral graph partitioning that readily combine the two processes into one. A part-based recognition system detects object patches, supplies their partial segmentations as well as knowledge about the spatial configurations of the object. The goal of patch grouping is to find a set of patches that conform best to the object configuration, while the goal of pixel grouping is to find a set of pixels that have the best low-level feature similarity. Through pixel-patch interactions and between-patch competition encoded in the solution space, these two processes are realized in one joint optimization problem. The globally optimal partition is obtained by solving a constrained eigenvalue problem. We demonstrate that the resulting object segmentation eliminates false positives for the part detection, while overcoming occlusion and weak contours for the low-level edge detection.

4 0.15431678 10 nips-2002-A Model for Learning Variance Components of Natural Images

Author: Yan Karklin, Michael S. Lewicki

Abstract: We present a hierarchical Bayesian model for learning efficient codes of higher-order structure in natural images. The model, a non-linear generalization of independent component analysis, replaces the standard assumption of independence for the joint distribution of coefficients with a distribution that is adapted to the variance structure of the coefficients of an efficient image basis. This offers a novel description of higher-order image structure and provides a way to learn coarse-coded, sparse-distributed representations of abstract image properties such as object location, scale, and texture.

5 0.12997369 127 nips-2002-Learning Sparse Topographic Representations with Products of Student-t Distributions

Author: Max Welling, Simon Osindero, Geoffrey E. Hinton

Abstract: We propose a model for natural images in which the probability of an image is proportional to the product of the probabilities of some filter outputs. We encourage the system to find sparse features by using a Student-t distribution to model each filter output. If the t-distribution is used to model the combined outputs of sets of neurally adjacent filters, the system learns a topographic map in which the orientation, spatial frequency and location of the filters change smoothly across the map. Even though maximum likelihood learning is intractable in our model, the product form allows a relatively efficient learning procedure that works well even for highly overcomplete sets of filters. Once the model has been learned it can be used as a prior to derive the “iterated Wiener filter” for the purpose of denoising images.

6 0.12655644 132 nips-2002-Learning to Detect Natural Image Boundaries Using Brightness and Texture

7 0.12007726 133 nips-2002-Learning to Perceive Transparency from the Statistics of Natural Scenes

8 0.11534295 204 nips-2002-VIBES: A Variational Inference Engine for Bayesian Networks

9 0.11172606 122 nips-2002-Learning About Multiple Objects in Images: Factorial Learning without Factorial Search

10 0.11028598 87 nips-2002-Fast Transformation-Invariant Factor Analysis

11 0.10620281 173 nips-2002-Recovering Intrinsic Images from a Single Image

12 0.10434531 21 nips-2002-Adaptive Classification by Variational Kalman Filtering

13 0.10426193 202 nips-2002-Unsupervised Color Constancy

14 0.10213223 126 nips-2002-Learning Sparse Multiscale Image Representations

15 0.097330637 73 nips-2002-Dynamic Bayesian Networks with Deterministic Latent Tables

16 0.092934385 206 nips-2002-Visual Development Aids the Acquisition of Motion Velocity Sensitivities

17 0.089726083 116 nips-2002-Interpreting Neural Response Variability as Monte Carlo Sampling of the Posterior

18 0.088553198 177 nips-2002-Retinal Processing Emulation in a Programmable 2-Layer Analog Array Processor CMOS Chip

19 0.084629677 182 nips-2002-Shape Recipes: Scene Representations that Refer to the Image

20 0.083010599 193 nips-2002-Temporal Coherence, Natural Image Sequences, and the Visual Cortex


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.234), (1, 0.013), (2, -0.058), (3, 0.38), (4, -0.049), (5, 0.055), (6, 0.133), (7, 0.064), (8, -0.012), (9, -0.079), (10, 0.054), (11, -0.079), (12, 0.089), (13, 0.006), (14, 0.068), (15, -0.07), (16, 0.113), (17, 0.097), (18, 0.066), (19, 0.15), (20, -0.08), (21, 0.078), (22, -0.154), (23, 0.057), (24, 0.085), (25, 0.26), (26, 0.079), (27, -0.068), (28, -0.184), (29, 0.034), (30, 0.032), (31, -0.038), (32, 0.07), (33, -0.015), (34, 0.202), (35, 0.141), (36, -0.018), (37, -0.058), (38, 0.086), (39, 0.032), (40, 0.055), (41, 0.088), (42, 0.086), (43, 0.041), (44, 0.11), (45, 0.036), (46, 0.001), (47, -0.009), (48, 0.044), (49, 0.037)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.98432559 74 nips-2002-Dynamic Structure Super-Resolution


2 0.93413591 39 nips-2002-Bayesian Image Super-Resolution


3 0.55319774 133 nips-2002-Learning to Perceive Transparency from the Statistics of Natural Scenes

Author: Anat Levin, Assaf Zomet, Yair Weiss

Abstract: Certain simple images are known to trigger a percept of transparency: the input image I is perceived as the sum of two images I(x, y) = I1 (x, y) + I2 (x, y). This percept is puzzling. First, why do we choose the “more complicated” description with two images rather than the “simpler” explanation I(x, y) = I1 (x, y) + 0 ? Second, given the infinite number of ways to express I as a sum of two images, how do we compute the “best” decomposition ? Here we suggest that transparency is the rational percept of a system that is adapted to the statistics of natural scenes. We present a probabilistic model of images based on the qualitative statistics of derivative filters and “corner detectors” in natural scenes and use this model to find the most probable decomposition of a novel image. The optimization is performed using loopy belief propagation. We show that our model computes perceptually “correct” decompositions on synthetic images and discuss its application to real images. 1

4 0.55182117 182 nips-2002-Shape Recipes: Scene Representations that Refer to the Image

Author: William T. Freeman, Antonio Torralba

Abstract: The goal of low-level vision is to estimate an underlying scene, given an observed image. Real-world scenes (eg, albedos or shapes) can be very complex, conventionally requiring high dimensional representations which are hard to estimate and store. We propose a low-dimensional representation, called a scene recipe, that relies on the image itself to describe the complex scene configurations. Shape recipes are an example: these are the regression coefficients that predict the bandpassed shape from image data. We describe the benefits of this representation, and show two uses illustrating their properties: (1) we improve stereo shape estimates by learning shape recipes at low resolution and applying them at full resolution; (2) Shape recipes implicitly contain information about lighting and materials and we use them for material segmentation.

5 0.52804351 87 nips-2002-Fast Transformation-Invariant Factor Analysis

Author: Anitha Kannan, Nebojsa Jojic, Brendan J. Frey

Abstract: Dimensionality reduction techniques such as principal component analysis and factor analysis are used to discover a linear mapping between high dimensional data samples and points in a lower dimensional subspace. In [6], Jojic and Frey introduced mixture of transformation-invariant component analyzers (MTCA) that can account for global transformations such as translations and rotations, perform clustering and learn local appearance deformations by dimensionality reduction. However, due to enormous computational requirements of the EM algorithm for learning the model, O(N^2) where N is the dimensionality of a data sample, MTCA was not practical for most applications. In this paper, we demonstrate how fast Fourier transforms can reduce the computation to the order of N log N. With this speedup, we show the effectiveness of MTCA in various applications - tracking, video textures, clustering video sequences, object recognition, and object detection in images.

6 0.51243949 150 nips-2002-Multiple Cause Vector Quantization

7 0.4767535 173 nips-2002-Recovering Intrinsic Images from a Single Image

8 0.44076231 132 nips-2002-Learning to Detect Natural Image Boundaries Using Brightness and Texture

9 0.40615374 127 nips-2002-Learning Sparse Topographic Representations with Products of Student-t Distributions

10 0.39426526 57 nips-2002-Concurrent Object Recognition and Segmentation by Graph Partitioning

11 0.37169567 126 nips-2002-Learning Sparse Multiscale Image Representations

12 0.37029296 122 nips-2002-Learning About Multiple Objects in Images: Factorial Learning without Factorial Search

13 0.34024611 202 nips-2002-Unsupervised Color Constancy

14 0.33769074 10 nips-2002-A Model for Learning Variance Components of Natural Images

15 0.33661175 204 nips-2002-VIBES: A Variational Inference Engine for Bayesian Networks

16 0.30610093 21 nips-2002-Adaptive Classification by Variational Kalman Filtering

17 0.30230713 177 nips-2002-Retinal Processing Emulation in a Programmable 2-Layer Analog Array Processor CMOS Chip

18 0.2884728 179 nips-2002-Scaling of Probability-Based Optimization Algorithms

19 0.28039443 193 nips-2002-Temporal Coherence, Natural Image Sequences, and the Visual Cortex

20 0.26781359 206 nips-2002-Visual Development Aids the Acquisition of Motion Velocity Sensitivities


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(3, 0.016), (11, 0.027), (23, 0.033), (42, 0.089), (54, 0.113), (55, 0.028), (57, 0.011), (67, 0.021), (68, 0.035), (73, 0.227), (74, 0.155), (92, 0.047), (98, 0.118)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.89766181 13 nips-2002-A Note on the Representational Incompatibility of Function Approximation and Factored Dynamics

Author: Eric Allender, Sanjeev Arora, Michael Kearns, Cristopher Moore, Alexander Russell

Abstract: We establish a new hardness result that shows that the difficulty of planning in factored Markov decision processes is representational rather than just computational. More precisely, we give a fixed family of factored MDPs with linear rewards whose optimal policies and value functions simply cannot be represented succinctly in any standard parametric form. Previous hardness results indicated that computing good policies from the MDP parameters was difficult, but left open the possibility of succinct function approximation for any fixed factored MDP. Our result applies even to policies which yield a polynomially poor approximation to the optimal value, and highlights interesting connections with the complexity class of Arthur-Merlin games.

same-paper 2 0.87500089 74 nips-2002-Dynamic Structure Super-Resolution

Author: Amos J. Storkey

Abstract: The problem of super-resolution involves generating feasible higher resolution images, which are pleasing to the eye and realistic, from a given low resolution image. This might be attempted by using simple filters for smoothing out the high resolution blocks or through applications where substantial prior information is used to imply the textures and shapes which will occur in the images. In this paper we describe an approach which lies between the two extremes. It is a generic unsupervised method which is usable in all domains, but goes beyond simple smoothing methods in what it achieves. We use a dynamic tree-like architecture to model the high resolution data. Approximate conditioning on the low resolution image is achieved through a mean field approach. 1

3 0.84376383 201 nips-2002-Transductive and Inductive Methods for Approximate Gaussian Process Regression

Author: Anton Schwaighofer, Volker Tresp

Abstract: Gaussian process regression allows a simple analytical treatment of exact Bayesian inference and has been found to provide good performance, yet scales badly with the number of training data. In this paper we compare several approaches towards scaling Gaussian processes regression to large data sets: the subset of representers method, the reduced rank approximation, online Gaussian processes, and the Bayesian committee machine. Furthermore we provide theoretical insight into some of our experimental results. We found that subset of representers methods can give good and particularly fast predictions for data sets with high and medium noise levels. On complex low noise data sets, the Bayesian committee machine achieves significantly better accuracy, yet at a higher computational cost.

4 0.70953 132 nips-2002-Learning to Detect Natural Image Boundaries Using Brightness and Texture

Author: David R. Martin, Charless C. Fowlkes, Jitendra Malik

Abstract: The goal of this work is to accurately detect and localize boundaries in natural scenes using local image measurements. We formulate features that respond to characteristic changes in brightness and texture associated with natural boundaries. In order to combine the information from these features in an optimal way, a classifier is trained using human labeled images as ground truth. We present precision-recall curves showing that the resulting detector outperforms existing approaches.

5 0.70638567 175 nips-2002-Reinforcement Learning to Play an Optimal Nash Equilibrium in Team Markov Games

Author: Xiaofeng Wang, Tuomas Sandholm

Abstract: Multiagent learning is a key problem in AI. In the presence of multiple Nash equilibria, even agents with non-conflicting interests may not be able to learn an optimal coordination policy. The problem is exacerbated if the agents do not know the game and independently receive noisy payoffs. So, multiagent reinforcement learning involves two interrelated problems: identifying the game and learning to play. In this paper, we present optimal adaptive learning, the first algorithm that converges to an optimal Nash equilibrium with probability 1 in any team Markov game. We provide a convergence proof, and show that the algorithm’s parameters are easy to set to meet the convergence conditions.

6 0.70351291 52 nips-2002-Cluster Kernels for Semi-Supervised Learning

7 0.70118254 127 nips-2002-Learning Sparse Topographic Representations with Products of Student-t Distributions

8 0.69953251 3 nips-2002-A Convergent Form of Approximate Policy Iteration

9 0.69936442 89 nips-2002-Feature Selection by Maximum Marginal Diversity

10 0.69918084 31 nips-2002-Application of Variational Bayesian Approach to Speech Recognition

11 0.69915831 124 nips-2002-Learning Graphical Models with Mercer Kernels

12 0.69676071 135 nips-2002-Learning with Multiple Labels

13 0.69655883 163 nips-2002-Prediction and Semantic Association

14 0.69651556 39 nips-2002-Bayesian Image Super-Resolution

15 0.69555914 152 nips-2002-Nash Propagation for Loopy Graphical Games

16 0.69483531 82 nips-2002-Exponential Family PCA for Belief Compression in POMDPs

17 0.69444656 122 nips-2002-Learning About Multiple Objects in Images: Factorial Learning without Factorial Search

18 0.69388413 204 nips-2002-VIBES: A Variational Inference Engine for Bayesian Networks

19 0.69339192 2 nips-2002-A Bilinear Model for Sparse Coding

20 0.6926356 162 nips-2002-Parametric Mixture Models for Multi-Labeled Text