nips nips2013 nips2013-167 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Dan Rosenbaum, Daniel Zoran, Yair Weiss
Abstract: Motivated by recent progress in natural image statistics, we use newly available datasets with ground truth optical flow to learn the local statistics of optical flow and compare the learned models to prior models assumed by computer vision researchers. We find that a Gaussian mixture model (GMM) with 64 components provides a significantly better model for local flow statistics when compared to commonly used models. We investigate the source of the GMM’s success and show it is related to an explicit representation of flow boundaries. We also learn a model that jointly models the local intensity pattern and the local optical flow. In accordance with the assumptions often made in computer vision, the model learns that flow boundaries are more likely at intensity boundaries. However, when evaluated on a large dataset, this dependency is very weak and the benefit of conditioning flow estimation on the local intensity pattern is marginal.
Reference: text
sentIndex sentText sentNum sentScore
1 Motivated by recent progress in natural image statistics, we use newly available datasets with ground truth optical flow to learn the local statistics of optical flow and compare the learned models to prior models assumed by computer vision researchers. [sent-4, score-1.213]
2 We find that a Gaussian mixture model (GMM) with 64 components provides a significantly better model for local flow statistics when compared to commonly used models. [sent-5, score-0.164]
3 We also learn a model that jointly models the local intensity pattern and the local optical flow. [sent-7, score-0.86]
4 In accordance with the assumptions often made in computer vision, the model learns that flow boundaries are more likely at intensity boundaries. [sent-8, score-0.471]
5 However, when evaluated on a large dataset, this dependency is very weak and the benefit of conditioning flow estimation on the local intensity pattern is marginal. [sent-9, score-0.465]
6 We leverage these newly available resources to learn the statistics of optical flow and compare this to assumptions used by computer vision researchers. [sent-11, score-0.496]
7 Interestingly, the best models in terms of log likelihood, when used as priors in image restoration tasks, also yield state-of-the-art performance [14]. [sent-14, score-0.242]
8 A notable example is the computation of optical flow: a vector at every pixel that corresponds to the two dimensional projection of the motion at that pixel. [sent-16, score-0.475]
9 Since local motion information is often ambiguous, nearly all optical flow estimation algorithms work by minimizing a cost function that has two terms: a local data term and a “prior” term (see. [sent-17, score-0.593]
10 Given the success in image restoration tasks, where learned priors give state-of-the-art performance, one might expect a similar story in optical flow estimation. [sent-21, score-0.591]
11 However, with the notable exception of [9] (which served as a motivating example for this work and is discussed below) there have been very few attempts to learn priors for optical flow by modeling local statistics. [sent-22, score-0.51]
12 Instead, the state-of-the-art methods still use priors that were formulated by computer vision researchers. [sent-23, score-0.177]
13 In fact, two of the top performing methods in modern optical flow benchmarks use a hand-defined smoothness constraint that was suggested over 20 years ago [6, 2]. [sent-24, score-0.495]
14 One big difference between image statistics and flow statistics is the availability of ground truth data. [sent-25, score-0.263]
15 Whereas for modeling image statistics one merely needs a collection of photographs (so that the amount of data is essentially unlimited these days), for modeling flow statistics one needs to obtain the ground truth motion of the points in the scene. [sent-26, score-0.349]
16 In the past, the lack of availability of ground truth data did not allow for learning an optical flow prior from examples. [sent-27, score-0.559]
17 The Sintel dataset (figure 1) consists of a thousand pairs of frames from a highly realistic computer graphics film with a wide variety of locations and motion types. [sent-29, score-0.195]
18 The vehicle was equipped with accurate range finders as well as accurate localization of its own motion, and the combination of these two sources allows computing optical flow for points that are stationary in the world. [sent-32, score-0.363]
19 Although this is real data, it is sparse (only about 50% of the pixels have ground truth flow). [sent-33, score-0.154]
20 In this paper we leverage the availability of ground truth datasets to learn explicit statistical models of optical flow. [sent-34, score-0.578]
21 We compare our learned model to the assumptions made by computer vision algorithms for estimating flow. [sent-35, score-0.159]
22 We find that a Gaussian mixture model with 64 components provides a significantly better model for local flow statistics when compared to commonly used models. [sent-36, score-0.164]
23 We also learn a model that jointly models the local intensity pattern and the local optical flow. [sent-38, score-0.86]
24 In accordance with the assumptions often made in computer vision, the model learns that flow boundaries are more likely at intensity boundaries. [sent-39, score-0.471]
25 However, when evaluated on a large dataset, this dependency is very weak and the benefit of conditioning flow estimation on the local intensity pattern is marginal. [sent-40, score-0.465]
26 1 Priors for optical flow One of the earliest methods for optical flow that is still used in applications is the celebrated Lucas-Kanade algorithm [7]. [sent-42, score-0.684]
27 It overcomes the local ambiguity of motion analysis by assuming that the optical flow is constant within a small image patch and finds this constant motion by least-squares estimation. [sent-43, score-0.788]
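The constant-flow least-squares estimate can be sketched as follows. This is a minimal illustration, assuming the standard linearized brightness-constancy data term Ix*u + Iy*v + It = 0; the function name `lucas_kanade_patch` and the synthetic derivatives are hypothetical and not taken from the paper.

```python
import numpy as np

def lucas_kanade_patch(Ix, Iy, It):
    """Least-squares constant flow (u, v) for one patch.

    Ix, Iy, It are the spatial and temporal image derivatives in the patch.
    Solves min over (u, v) of sum (Ix*u + Iy*v + It)^2, an assumed data term.
    """
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)  # shape (n_pixels, 2)
    b = -It.ravel()
    flow, *_ = np.linalg.lstsq(A, b, rcond=None)
    return flow  # array([u, v])

# Synthetic check: derivatives that are consistent with a known constant motion.
rng = np.random.default_rng(0)
Ix = rng.standard_normal((8, 8))
Iy = rng.standard_normal((8, 8))
true_flow = np.array([0.5, -1.0])
It = -(Ix * true_flow[0] + Iy * true_flow[1])
estimated_flow = lucas_kanade_patch(Ix, Iy, It)
```

With noise-free synthetic derivatives the estimate recovers the constant motion; on real patches the system can be poorly conditioned, which is the local ambiguity mentioned above.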
28 It finds the optical flow by minimizing a cost function that has a data term and a “smoothness” term. [sent-45, score-0.342]
29 Denoting by u the horizontal flow and v the vertical flow, the smoothness term is of the form: $J_{HS} = \sum_{x,y} u_x^2 + u_y^2 + v_x^2 + v_y^2$, where $u_x, u_y$ are the spatial derivatives of the horizontal flow u and $v_x, v_y$ are the spatial derivatives of the vertical flow v. [sent-46, score-0.582]
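A discrete version of this quadratic smoothness term can be sketched as below. The use of forward differences and the function name `horn_schunck_energy` are assumptions made for illustration; the paper does not specify the discretization here.

```python
import numpy as np

def horn_schunck_energy(u, v):
    """Sum of squared forward differences of u and v (a discrete J_HS)."""
    ux = np.diff(u, axis=1)  # horizontal derivative of u
    uy = np.diff(u, axis=0)  # vertical derivative of u
    vx = np.diff(v, axis=1)
    vy = np.diff(v, axis=0)
    return (ux**2).sum() + (uy**2).sum() + (vx**2).sum() + (vy**2).sum()

# A constant flow field costs nothing; a motion boundary is penalized.
flat_u = np.full((8, 8), 2.0)
flat_v = np.full((8, 8), -1.0)
edge_u = np.zeros((8, 8))
edge_u[:, 4:] = 3.0  # vertical motion boundary with a jump of 3
energy_flat = horn_schunck_energy(flat_u, flat_v)
energy_edge = horn_schunck_energy(edge_u, flat_v)
```

The boundary patch pays the squared jump once per row, which is why quadratic smoothness is sensitive to flow discontinuities.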
30 Rather than using a quadratic smoothness term, many authors have advocated using more robust terms that would be less sensitive to outliers in smoothness. [sent-48, score-0.144]
31 Both the Lorentzian and the absolute value robust smoothness terms were shown to outperform quadratic smoothness in [11] and the absolute value was better among the two robust terms. [sent-51, score-0.288]
32 Several authors have also suggested that the smoothness term be based on the local intensity pattern, since motion discontinuities are more likely to occur at intensity boundaries. [sent-52, score-1.08]
33 Ren [8] modified the weights in the Lucas and Kanade least-squares estimation so that pixels that are on different sides of an intensity boundary will get lower weights. [sent-53, score-0.411]
34 Perhaps the simplest such regularizer is of the form: $J_{HSI} = \sum_{x,y} w(I_x)(u_x^2 + v_x^2) + w(I_y)(u_y^2 + v_y^2)$ (1). As we discuss below, this prior can be seen as a Gaussian prior on the flow that is conditioned on the intensity. [sent-55, score-0.23]
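The intensity-weighted regularizer can be sketched in the same discrete form. The weight function below (a Gaussian falloff in the intensity gradient, with a hypothetical `sigma`) is only an assumption; the text leaves w unspecified beyond it depending on the intensity derivatives.

```python
import numpy as np

def weight(grad, sigma=0.1):
    # Hypothetical weight: 1 in flat image regions, near 0 at strong edges.
    return np.exp(-(grad / sigma) ** 2)

def intensity_weighted_energy(u, v, image, sigma=0.1):
    """Discrete intensity-weighted smoothness using forward differences."""
    Ix = np.diff(image, axis=1)
    Iy = np.diff(image, axis=0)
    ux, uy = np.diff(u, axis=1), np.diff(u, axis=0)
    vx, vy = np.diff(v, axis=1), np.diff(v, axis=0)
    horizontal = weight(Ix, sigma) * (ux**2 + vx**2)
    vertical = weight(Iy, sigma) * (uy**2 + vy**2)
    return horizontal.sum() + vertical.sum()

# The same motion boundary, with and without a matching intensity boundary.
u = np.zeros((8, 8))
u[:, 4:] = 3.0
v = np.zeros((8, 8))
aligned_image = np.zeros((8, 8))
aligned_image[:, 4:] = 1.0
flat_image = np.zeros((8, 8))
energy_aligned = intensity_weighted_energy(u, v, aligned_image)
energy_flat_image = intensity_weighted_energy(u, v, flat_image)
```

Under this choice of w, a motion boundary that coincides with an intensity boundary is almost free, while the same boundary in a flat image region pays the full quadratic penalty.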
35 They used a training set of optical flow obtained by simulating the motion of a camera in natural range images. [sent-57, score-0.475]
36 The prior learned by their system was similar to a robust smoothness prior, but the filters are not local derivatives but rather more random-looking high pass filters. [sent-58, score-0.347]
37 One way to answer this question is to use these priors as a basis for an optical flow estimation algorithm and see which algorithm gives the best performance. [sent-61, score-0.428]
38 [11] reported that adding a non-local smoothness term to a robust smoothness prior significantly improved results on the Middlebury benchmark, while Geiger et al. [sent-64, score-0.299]
39 Perhaps the main difficulty with this approach is that the prior is only one part of an optical flow estimation algorithm. [sent-66, score-0.382]
40 Motivated by recent advances in natural image statistics and the availability of new datasets, we compare different priors in terms of (1) log likelihood on held-out data and (2) inference performance with tractable posteriors. [sent-70, score-0.263]
41 2 Comparing priors as density models In order to compare different prior models as density models, we generate a training set and test set of optical flow patches from the ground truth databases. [sent-72, score-0.91]
42 Denoting by f a single vector that concatenates all the optical flow in a patch (e. [sent-73, score-0.419]
43 Given a prior probability model Pr(f ; θ) we use the training set to estimate the free parameters of the model θ and then we measure the log likelihood of held out patches from the test set. [sent-76, score-0.396]
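The evaluation protocol (fit parameters on training patches, score held-out patches) can be sketched for the simplest case of a single zero-mean Gaussian. The synthetic data, the dimensions and the function names are illustrative assumptions, not the paper's data or code.

```python
import numpy as np

def fit_zero_mean_gaussian(train):
    """Maximum-likelihood covariance of a zero-mean Gaussian; train is (n, dim)."""
    return train.T @ train / train.shape[0]

def mean_log_likelihood(patches, cov):
    """Mean log likelihood of the patches under N(0, cov)."""
    dim = patches.shape[1]
    _, logdet = np.linalg.slogdet(cov)
    solved = np.linalg.solve(cov, patches.T).T
    quad = np.einsum("ij,ij->i", patches, solved)
    return float(np.mean(-0.5 * (quad + logdet + dim * np.log(2 * np.pi))))

# Synthetic stand-in for flow patches: anisotropic zero-mean Gaussian data.
rng = np.random.default_rng(0)
scales = np.array([2.0, 1.0, 0.5, 0.5])
train = rng.standard_normal((5000, 4)) * scales
test = rng.standard_normal((2000, 4)) * scales
learned_cov = fit_zero_mean_gaussian(train)
ll_learned = mean_log_likelihood(test, learned_cov)
ll_isotropic = mean_log_likelihood(test, np.eye(4))
```

Because the score is computed on patches that were not used for fitting, a model with more parameters only wins if it captures structure that generalizes.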
44 From Sintel, we divided the pairs of frames for which ground truth is available into 708 pairs which we used for training and 333 pairs which we used for testing. [sent-77, score-0.169]
45 We created a second test set from the KITTI dataset by choosing a subset of patches for which full ground truth flow was available. [sent-79, score-0.37]
46 By exponentiating $J_{HS}$ we see that Pr(f ; θ) is a multidimensional Gaussian with covariance matrix $\lambda D D^T$ where D is a 256 × 128 derivative matrix that computes the derivatives of the flow field at each pixel and λ is the weight given to the prior relative to the data term. [sent-85, score-0.152]
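One way to build a 256 × 128 derivative matrix for an 8 × 8 patch is sketched below. The circular boundary handling and the ordering of rows and columns are assumptions chosen so that every pixel contributes four derivatives; the paper's exact construction is not given in this extract.

```python
import numpy as np

def derivative_matrix(size=8):
    """Stacked forward-difference operators acting on f = [u; v]."""
    n = size * size
    idx = np.arange(n).reshape(size, size)

    def diff_op(axis):
        op = np.zeros((n, n))
        neighbor = np.roll(idx, -1, axis=axis)  # circular neighbor (assumption)
        op[idx.ravel(), neighbor.ravel()] = 1.0
        op[idx.ravel(), idx.ravel()] -= 1.0
        return op

    dx, dy = diff_op(1), diff_op(0)
    zeros = np.zeros((n, n))
    # Rows: u_x, u_y, v_x, v_y at every pixel; columns: u pixels then v pixels.
    return np.block([[dx, zeros], [dy, zeros], [zeros, dx], [zeros, dy]])

D = derivative_matrix(8)
constant_flow = np.ones(128)
response = D @ constant_flow  # all derivatives of a constant field vanish
```

With this D, the quadratic smoothness energy of a patch f is the squared norm of D @ f, so a constant flow field has zero energy.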
47 Motivated by the success of GMMs in modeling natural image statistics [14] we use the training set to estimate GMM priors for optical flow. [sent-93, score-0.517]
48 Each mixture component is a multidimensional Gaussian with full covariance matrix and zero mean and we vary the number of components between 1 and 64. [sent-94, score-0.168]
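The log likelihood of a patch under such a zero-mean, full-covariance mixture can be sketched as follows. The function name and the toy parameters are illustrative; fitting the mixture (for example with EM) is not shown.

```python
import numpy as np

def gmm_log_likelihood(patches, weights, covs):
    """Per-patch log likelihood under a zero-mean, full-covariance GMM."""
    n, dim = patches.shape
    log_terms = np.empty((n, len(weights)))
    for k, (pi_k, cov) in enumerate(zip(weights, covs)):
        _, logdet = np.linalg.slogdet(cov)
        solved = np.linalg.solve(cov, patches.T).T
        quad = np.einsum("ij,ij->i", patches, solved)
        log_terms[:, k] = np.log(pi_k) - 0.5 * (quad + logdet + dim * np.log(2 * np.pi))
    # Log-sum-exp over components for numerical stability.
    peak = log_terms.max(axis=1, keepdims=True)
    return peak[:, 0] + np.log(np.exp(log_terms - peak).sum(axis=1))

patches = np.array([[0.0, 0.0], [1.0, -2.0]])
single = gmm_log_likelihood(patches, [1.0], [np.eye(2)])
split = gmm_log_likelihood(patches, [0.5, 0.5], [np.eye(2), np.eye(2)])
```

Splitting one component into two identical halves leaves the likelihood unchanged, which is a quick sanity check on the mixture arithmetic.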
49 Even with a few mixture components, the GMM has far more free parameters than the previous models but note that we are measuring success on held out patches so that models that overfit should be penalized. [sent-96, score-0.411]
50 One interesting thing that can be seen is that the local statistics validate some assumptions commonly used by computer vision researchers. [sent-98, score-0.232]
51 For example, the Horn and Schunck smoothness prior is as good as the optimal Gaussian prior (GMM1) even though it uses local first derivatives. [sent-99, score-0.254]
52 It can be seen that the GMM is indeed much better than the other priors including cases where the test set is taken from KITTI (rather than Sintel) and when the patch size is 12 × 12 rather than 8 × 8. [sent-104, score-0.183]
53 Figure 2: mean log likelihood of the different models (LK, HS, L1, GMM1, GMM2, GMM4, GMM8, GMM16, GMM64) for 8 × 8 patches extracted from held out data from Sintel. [sent-105, score-0.392]
54 1 Comparing models using tractable inference A second way of comparing the models is by their ability to restore corrupted patches of optical flow. [sent-108, score-0.654]
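For a single zero-mean Gaussian prior, restoring a noisy patch has a closed form. The sketch below assumes additive white Gaussian noise with known variance, which is an assumption of this illustration rather than a statement of the paper's corruption model; the function name is hypothetical.

```python
import numpy as np

def gaussian_prior_denoise(noisy, prior_cov, noise_var):
    """Posterior mean of f given y = f + noise, with f ~ N(0, prior_cov)."""
    dim = prior_cov.shape[0]
    return prior_cov @ np.linalg.solve(prior_cov + noise_var * np.eye(dim), noisy)

# Toy prior: all variance in the first coordinate, none in the second.
prior_cov = np.diag([4.0, 0.0])
restored = gaussian_prior_denoise(np.array([5.0, 5.0]), prior_cov, noise_var=1.0)
```

Coordinates where the prior allows variation are shrunk mildly toward zero, while coordinates the prior considers impossible are removed entirely.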
55 We are not claiming that optical flow restoration is a real-world application (although using priors to “fill in” holes in optical flow is quite common, e. [sent-109, score-0.816]
56 3 The secret of the GMM We now take a deeper look at how the GMM models optical flow patches. [sent-126, score-0.378]
57 The first (and not surprising) thing we found is that the covariance matrices learned by the model are block diagonal (so that the u and v components are independent given the assignment to a particular component). [sent-127, score-0.153]
58 More insight can be gained by considering the GMM as a local subspace model: a patch which is generated by component k is generated as a linear combination of the eigenvectors of the kth covariance. [sent-128, score-0.264]
59 The coefficients of the linear combination have energy that decays with the eigenvalue: so each patch can be well approximated by the leading eigenvectors of the corresponding covariance. [sent-129, score-0.138]
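The local subspace view can be sketched by projecting a patch onto the leading eigenvectors of one component's covariance. The toy covariance and the function name are illustrative assumptions.

```python
import numpy as np

def leading_subspace_approximation(patch, cov, n_vectors):
    """Project a patch onto the n_vectors leading eigenvectors of cov."""
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_vectors]
    basis = eigvecs[:, order]
    return basis @ (basis.T @ patch)

cov = np.diag([10.0, 1.0, 0.1, 0.01])
patch = np.array([2.0, 0.0, 0.0, 0.0])
approx_one = leading_subspace_approximation(patch, cov, 1)
other = np.array([1.0, 1.0, 1.0, 1.0])
approx_two = leading_subspace_approximation(other, cov, 2)
```

A patch that lies along the dominant eigenvector is reproduced exactly from one coefficient, while a generic patch loses only the directions with small eigenvalues.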
60 Unlike global subspace models, different subspace models can be used for different patches, and during inference with the model one can infer which local subspace is most likely to have generated the patch. [sent-130, score-0.253]
61 Figure 6 shows the dominant leading eigenvectors of all 32 covariance matrices in the GMM32 model: the eigenvectors of u are followed by the eigenvectors of v. [sent-131, score-0.205]
62 The right hand half of each row shows (u,v) patches that are sampled from that Gaussian. [sent-134, score-0.24]
63 It can be seen that the first 10 components or so model very smooth components (in fact the samples appear to be completely flat). [sent-137, score-0.14]
64 This can also be seen by comparing the v samples on the top row which are close to gray with those in the next two rows which are much closer to black or white (since the models are zero mean, black and white are equally likely for any component). [sent-139, score-0.18]
65 Thus these components closely resemble the non-local smoothness assumption suggested in [11]: they not only assume that derivatives are small but also that the entire 8 × 8 patch is constant. [sent-141, score-0.345]
66 However, unlike the suggestion in [11] to enforce non-local smoothness by applying a median filter at all pixels, the GMM only applies non-local smoothness at a subset of patches that are inferred to be generated by such components. [sent-142, score-0.47]
67 we see that the components no longer model flat components but rather motion boundaries. [sent-144, score-0.253]
68 For example, the bottom row of the figure illustrates a component that seems to generate primarily diagonal motion boundaries. [sent-146, score-0.16]
69 Interestingly, such local subspace models of optical flow have also been suggested by Fleet et al. [sent-147, score-0.515]
70 They used synthetic models of moving occlusion boundaries and bars to learn linear subspace models of the flow. [sent-149, score-0.186]
71 The GMM seems to support their intuition that learning separate linear subspace models for flat vs motion boundary is a good idea. [sent-150, score-0.278]
72 Figure 6: The eigenvectors and samples of the GMM components (panels: leading eigenvectors and patch samples, each for u and v). [sent-154, score-0.199]
73 GMM is better because it explicitly models edges and flat patches separately. [sent-155, score-0.276]
74 4 A joint model for optical flow and intensity As mentioned in the introduction, many authors have suggested modifying the smoothness assumption by conditioning it on the local intensity pattern and giving a higher penalty for motion discontinuities in the absence of intensity discontinuities. [sent-156, score-1.771]
75 We therefore ask, does conditioning on the local intensity give better log likelihood on held out flow patches? [sent-157, score-0.539]
76 We evaluated two flow models that are conditioned on the local intensity pattern. [sent-159, score-0.433]
77 The second one is a Gaussian mixture model that simultaneously models both intensity and flow. [sent-164, score-0.378]
78 The simultaneous GMM we use includes a 200 component GMM to model the intensity together with a 64 component GMM to model the flow. [sent-165, score-0.345]
79 We allow a dependence between the hidden variable of the intensity GMM and that of the flow GMM. [sent-166, score-0.337]
80 This is equivalent to a hidden Markov model (HMM) with 2 hidden variables: one represents the intensity component and one represents the flow component (figure 8). [sent-167, score-0.41]
81 Initialization is given by independent GMMs learned for the intensity (we actually use the one learned by [14] which is available on their website) and for the flow. [sent-169, score-0.416]
82 The intensity GMM is not changed during the learning. [sent-170, score-0.318]
83 Conditioned on the intensity pattern, the flow distribution is still a GMM with 64 components (as in the previous section) but the mixing weights depend on the intensity. [sent-171, score-0.378]
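The intensity-dependent mixing weights can be sketched as a product of the posterior over intensity components with a row-stochastic transition matrix. The tiny 2 × 2 example and the function name are illustrative assumptions; the model described above uses 200 intensity components and 64 flow components.

```python
import numpy as np

def conditional_flow_weights(intensity_posterior, transition):
    """Mixing weights of the flow GMM given an intensity patch.

    intensity_posterior: (n_intensity,) posterior over intensity components.
    transition: (n_intensity, n_flow); row i is Pr(flow comp | intensity comp i).
    """
    return intensity_posterior @ transition

# Toy example: component 0 is "flat", component 1 is "boundary".
transition = np.array([[0.9, 0.1],
                       [0.6, 0.4]])
weights_boundary = conditional_flow_weights(np.array([0.0, 1.0]), transition)
weights_uncertain = conditional_flow_weights(np.array([0.5, 0.5]), transition)
```

In this toy matrix an intensity boundary raises the weight of the flow-boundary component, yet the flat flow component remains the most likely one.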
84 Conditioning on the intensity gives basically zero improvement in log likelihood and a slight improvement in flow denoising only for very large amounts of noise. [sent-174, score-0.428]
85 To investigate this effect, we examine the transition matrix between the intensity components and the flow components (figure 8). [sent-176, score-0.457]
86 If intensity and flow were independent, we would expect all rows of the transition matrix to be the same. [sent-177, score-0.365]
87 If an intensity boundary always led to a flow boundary, we would expect the bottom rows of the matrix to have only one nonzero element. [sent-178, score-0.415]
88 Regardless of whether the intensity component corresponds to a boundary or not, the most likely flow components are flat. [sent-180, score-0.512]
89 When there is an intensity boundary, the flow boundary in the same orientation becomes more likely. [sent-181, score-0.387]
90 To rule out that this effect is due to a local optimum found by EM, we conducted additional experiments whereby the emission probabilities were held fixed to the GMMs learned independently for intensity and flow, and each patch in the training set was assigned one intensity and one flow component. [sent-183, score-0.687]
91 We then estimated the joint distribution over intensity and flow components by simply counting the relative frequency in the training set. [sent-184, score-0.193]
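The counting estimate can be sketched as follows, assuming each training patch has already been assigned one intensity component and one flow component. The labels and the function name are hypothetical.

```python
import numpy as np

def estimate_by_counting(intensity_labels, flow_labels, n_intensity, n_flow):
    """Joint and row-conditional distributions from hard component labels."""
    counts = np.zeros((n_intensity, n_flow))
    np.add.at(counts, (intensity_labels, flow_labels), 1)
    joint = counts / counts.sum()
    row_totals = counts.sum(axis=1, keepdims=True)
    # Rows with no observations are left as zeros instead of dividing by zero.
    conditional = np.divide(counts, row_totals,
                            out=np.zeros_like(counts), where=row_totals > 0)
    return joint, conditional

intensity_labels = np.array([0, 0, 0, 1])
flow_labels = np.array([0, 0, 1, 1])
joint, conditional = estimate_by_counting(intensity_labels, flow_labels, 2, 2)
```

Since no iterative optimization is involved, this estimate cannot be affected by a poor local optimum of EM.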
92 In summary, while our learned model supports the standard intuition that motion boundaries are more likely at intensity boundaries, it suggests that when dealing with a large dataset with high variability, there is very little benefit (if any) in conditioning flow models on the local intensity. [sent-186, score-0.73]
93 Figure 8: Left: the transition matrix learned by the HMM (axes: h intensity, h flow; panels: un-conditional mixing-weights, conditional mixing-weights). [sent-189, score-0.704]
94 Conditioned on an intensity boundary, motion boundaries become more likely but are still less likely than a flat motion. [sent-191, score-0.578]
95 In this paper, we have leveraged the availability of large ground truth databases to learn priors from data and compare our learned models to the assumptions typically made by computer vision researchers. [sent-193, score-0.481]
96 the Horn and Schunck model is close to the optimal Gaussian model, robust models are better, intensity discontinuities make motion discontinuities more likely). [sent-196, score-0.638]
97 However, a learned GMM model with 64 components significantly outperforms the standard models used in computer vision, primarily because it explicitly distinguishes between flat patches and boundary patches and then uses a different form of nonlocal smoothness for the different cases. [sent-197, score-0.832]
98 A framework for the robust estimation of optical flow. [sent-206, score-0.371]
99 A naturalistic open source movie for optical flow evaluation. [sent-212, score-0.342]
100 Design and use of linear models for image motion analysis. [sent-218, score-0.213]
wordName wordTfidf (topN-words)
[('ow', 0.564), ('gmm', 0.35), ('optical', 0.342), ('intensity', 0.318), ('patches', 0.24), ('hs', 0.146), ('motion', 0.133), ('kitti', 0.123), ('sintel', 0.122), ('smoothness', 0.115), ('lk', 0.095), ('horn', 0.09), ('psnr', 0.088), ('priors', 0.086), ('patch', 0.077), ('boundary', 0.069), ('vision', 0.068), ('truth', 0.068), ('ground', 0.062), ('discontinuities', 0.061), ('eigenvectors', 0.061), ('gure', 0.06), ('components', 0.06), ('local', 0.059), ('vx', 0.057), ('derivatives', 0.055), ('vy', 0.053), ('fraction', 0.053), ('schunck', 0.052), ('held', 0.051), ('boundaries', 0.051), ('learned', 0.049), ('availability', 0.047), ('restoration', 0.046), ('conditioning', 0.046), ('denoising', 0.045), ('image', 0.044), ('hmm', 0.044), ('gmms', 0.042), ('prior', 0.04), ('inpainting', 0.04), ('lucas', 0.04), ('flow', 0.04), ('subspace', 0.04), ('frames', 0.039), ('suggested', 0.038), ('likely', 0.038), ('fleet', 0.038), ('models', 0.036), ('likelihood', 0.035), ('ddt', 0.035), ('hsi', 0.035), ('jba', 0.035), ('jhs', 0.035), ('lorentzian', 0.035), ('shunck', 0.035), ('uy', 0.035), ('multidimensional', 0.035), ('ix', 0.034), ('gaussian', 0.033), ('roth', 0.033), ('bls', 0.031), ('log', 0.03), ('black', 0.029), ('robust', 0.029), ('kanade', 0.028), ('rows', 0.028), ('component', 0.027), ('yair', 0.025), ('pixels', 0.024), ('geiger', 0.024), ('hamiltonian', 0.024), ('success', 0.024), ('michael', 0.024), ('mixture', 0.024), ('pattern', 0.023), ('learn', 0.023), ('computer', 0.023), ('stefan', 0.022), ('annealed', 0.022), ('unconditional', 0.022), ('ux', 0.022), ('thing', 0.022), ('accordance', 0.022), ('lters', 0.022), ('covariance', 0.022), ('ask', 0.021), ('horizontal', 0.021), ('statistics', 0.021), ('vehicle', 0.021), ('iid', 0.021), ('conditioned', 0.02), ('seen', 0.02), ('transition', 0.019), ('hidden', 0.019), ('dependency', 0.019), ('vertical', 0.019), ('assumptions', 0.019), ('driving', 0.019), ('daniel', 0.018)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999988 167 nips-2013-Learning the Local Statistics of Optical Flow
2 0.14165442 128 nips-2013-Generalized Method-of-Moments for Rank Aggregation
Author: Hossein Azari Soufiani, William Chen, David C. Parkes, Lirong Xia
Abstract: In this paper we propose a class of efficient Generalized Method-of-Moments (GMM) algorithms for computing parameters of the Plackett-Luce model, where the data consists of full rankings over alternatives. Our technique is based on breaking the full rankings into pairwise comparisons, and then computing parameters that satisfy a set of generalized moment conditions. We identify conditions for the output of GMM to be unique, and identify a general class of consistent and inconsistent breakings. We then show by theory and experiments that our algorithms run significantly faster than the classical Minorize-Maximization (MM) algorithm, while achieving competitive statistical efficiency.
3 0.11351682 190 nips-2013-Mid-level Visual Element Discovery as Discriminative Mode Seeking
Author: Carl Doersch, Abhinav Gupta, Alexei A. Efros
Abstract: Recent work on mid-level visual representations aims to capture information at the level of complexity higher than typical “visual words”, but lower than full-blown semantic objects. Several approaches [5, 6, 12, 23] have been proposed to discover mid-level visual elements, that are both 1) representative, i.e., frequently occurring within a visual dataset, and 2) visually discriminative. However, the current approaches are rather ad hoc and difficult to analyze and evaluate. In this work, we pose visual element discovery as discriminative mode seeking, drawing connections to the well-known and well-studied mean-shift algorithm [2, 1, 4, 8]. Given a weakly-labeled image collection, our method discovers visually-coherent patch clusters that are maximally discriminative with respect to the labels. One advantage of our formulation is that it requires only a single pass through the data. We also propose the Purity-Coverage plot as a principled way of experimentally analyzing and evaluating different visual discovery approaches, and compare our method against prior work on the Paris Street View dataset of [5]. We also evaluate our method on the task of scene classification, demonstrating state-of-the-art performance on the MIT Scene-67 dataset.
4 0.10330119 5 nips-2013-A Deep Architecture for Matching Short Texts
Author: Zhengdong Lu, Hang Li
Abstract: Many machine learning problems can be interpreted as learning for matching two types of objects (e.g., images and captions, users and products, queries and documents, etc.). The matching level of two objects is usually measured as the inner product in a certain feature space, while the modeling effort focuses on mapping of objects from the original space to the feature space. This schema, although proven successful on a range of matching tasks, is insufficient for capturing the rich structure in the matching process of more complicated objects. In this paper, we propose a new deep architecture to more effectively model the complicated matching relations between two objects from heterogeneous domains. More specifically, we apply this model to matching tasks in natural language, e.g., finding sensible responses for a tweet, or relevant answers to a given question. This new architecture naturally combines the localness and hierarchy intrinsic to the natural language problems, and therefore greatly improves upon the state-of-the-art models.
5 0.10042195 351 nips-2013-What Are the Invariant Occlusive Components of Image Patches? A Probabilistic Generative Approach
Author: Zhenwen Dai, Georgios Exarchakis, Jörg Lücke
Abstract: We study optimal image encoding based on a generative approach with non-linear feature combinations and explicit position encoding. By far most approaches to unsupervised learning of visual features, such as sparse coding or ICA, account for translations by representing the same features at different positions. Some earlier models used a separate encoding of features and their positions to facilitate invariant data encoding and recognition. All probabilistic generative models with explicit position encoding have so far assumed a linear superposition of components to encode image patches. Here, we for the first time apply a model with non-linear feature superposition and explicit position encoding for patches. By avoiding linear superpositions, the studied model represents a closer match to component occlusions which are ubiquitous in natural images. In order to account for occlusions, the non-linear model encodes patches qualitatively very different from linear models by using component representations separated into mask and feature parameters. We first investigated encodings learned by the model using artificial data with mutually occluding components. We find that the model extracts the components, and that it can correctly identify the occlusive components with the hidden variables of the model. On natural image patches, the model learns component masks and features for typical image components. By using reverse correlation, we estimate the receptive fields associated with the model’s hidden units. We find many Gabor-like or globular receptive fields as well as fields sensitive to more complex structures. Our results show that probabilistic models that capture occlusions and invariances can be trained efficiently on image patches, and that the resulting encoding represents an alternative model for the neural encoding of images in the primary visual cortex.
6 0.072156966 114 nips-2013-Extracting regions of interest from biological images with convolutional sparse block coding
7 0.068875983 83 nips-2013-Deep Fisher Networks for Large-Scale Image Classification
8 0.063993305 195 nips-2013-Modeling Clutter Perception using Parametric Proto-object Partitioning
9 0.061991617 260 nips-2013-RNADE: The real-valued neural autoregressive density-estimator
10 0.059507526 229 nips-2013-Online Learning of Nonparametric Mixture Models via Sequential Variational Approximation
11 0.056729496 187 nips-2013-Memoized Online Variational Inference for Dirichlet Process Mixture Models
12 0.056475408 27 nips-2013-Adaptive Multi-Column Deep Neural Networks with Application to Robust Image Denoising
13 0.05613276 251 nips-2013-Predicting Parameters in Deep Learning
14 0.054748435 150 nips-2013-Learning Adaptive Value of Information for Structured Prediction
15 0.053582739 329 nips-2013-Third-Order Edge Statistics: Contour Continuation, Curvature, and Cortical Connections
16 0.053351946 173 nips-2013-Least Informative Dimensions
17 0.052571081 212 nips-2013-Non-Uniform Camera Shake Removal Using a Spatially-Adaptive Sparse Penalty
18 0.050100893 37 nips-2013-Approximate Bayesian Image Interpretation using Generative Probabilistic Graphics Programs
19 0.048378795 286 nips-2013-Robust learning of low-dimensional dynamics from large neural ensembles
20 0.047290396 178 nips-2013-Locally Adaptive Bayesian Multivariate Time Series
topicId topicWeight
[(0, 0.13), (1, 0.063), (2, -0.047), (3, -0.024), (4, 0.011), (5, -0.004), (6, 0.003), (7, 0.02), (8, -0.028), (9, -0.014), (10, -0.05), (11, 0.043), (12, -0.022), (13, 0.045), (14, -0.017), (15, 0.056), (16, 0.001), (17, -0.116), (18, -0.049), (19, 0.002), (20, 0.006), (21, 0.015), (22, -0.01), (23, -0.035), (24, -0.014), (25, -0.034), (26, 0.037), (27, 0.047), (28, 0.019), (29, 0.015), (30, 0.033), (31, 0.095), (32, 0.038), (33, 0.037), (34, -0.101), (35, 0.049), (36, 0.05), (37, 0.053), (38, -0.008), (39, 0.042), (40, -0.003), (41, -0.069), (42, -0.043), (43, -0.108), (44, -0.063), (45, -0.002), (46, -0.141), (47, 0.15), (48, -0.015), (49, -0.023)]
simIndex simValue paperId paperTitle
same-paper 1 0.93539369 167 nips-2013-Learning the Local Statistics of Optical Flow
2 0.76287121 351 nips-2013-What Are the Invariant Occlusive Components of Image Patches? A Probabilistic Generative Approach
3 0.59778124 114 nips-2013-Extracting regions of interest from biological images with convolutional sparse block coding
Author: Marius Pachitariu, Adam M. Packer, Noah Pettit, Henry Dalgleish, Michael Hausser, Maneesh Sahani
Abstract: Biological tissue is often composed of cells with similar morphologies replicated throughout large volumes and many biological applications rely on the accurate identification of these cells and their locations from image data. Here we develop a generative model that captures the regularities present in images composed of repeating elements of a few different types. Formally, the model can be described as convolutional sparse block coding. For inference we use a variant of convolutional matching pursuit adapted to block-based representations. We extend the KSVD learning algorithm to subspaces by retaining several principal vectors from the SVD decomposition instead of just one. Good models with little cross-talk between subspaces can be obtained by learning the blocks incrementally. We perform extensive experiments on simulated images and the inference algorithm consistently recovers a large proportion of the cells with a small number of false positives. We fit the convolutional model to noisy GCaMP6 two-photon images of spiking neurons and to Nissl-stained slices of cortical tissue and show that it recovers cell body locations without supervision. The flexibility of the block-based representation is reflected in the variability of the recovered cell shapes.
4 0.58749962 190 nips-2013-Mid-level Visual Element Discovery as Discriminative Mode Seeking
Author: Carl Doersch, Abhinav Gupta, Alexei A. Efros
Abstract: Recent work on mid-level visual representations aims to capture information at the level of complexity higher than typical “visual words”, but lower than full-blown semantic objects. Several approaches [5, 6, 12, 23] have been proposed to discover mid-level visual elements, that are both 1) representative, i.e., frequently occurring within a visual dataset, and 2) visually discriminative. However, the current approaches are rather ad hoc and difficult to analyze and evaluate. In this work, we pose visual element discovery as discriminative mode seeking, drawing connections to the well-known and well-studied mean-shift algorithm [2, 1, 4, 8]. Given a weakly-labeled image collection, our method discovers visually-coherent patch clusters that are maximally discriminative with respect to the labels. One advantage of our formulation is that it requires only a single pass through the data. We also propose the Purity-Coverage plot as a principled way of experimentally analyzing and evaluating different visual discovery approaches, and compare our method against prior work on the Paris Street View dataset of [5]. We also evaluate our method on the task of scene classification, demonstrating state-of-the-art performance on the MIT Scene-67 dataset.
5 0.58239633 195 nips-2013-Modeling Clutter Perception using Parametric Proto-object Partitioning
Author: Chen-Ping Yu, Wen-Yu Hua, Dimitris Samaras, Greg Zelinsky
Abstract: Visual clutter, the perception of an image as being crowded and disordered, affects aspects of our lives ranging from object detection to aesthetics, yet relatively little effort has been made to model this important and ubiquitous percept. Our approach models clutter as the number of proto-objects segmented from an image, with proto-objects defined as groupings of superpixels that are similar in intensity, color, and gradient orientation features. We introduce a novel parametric method of clustering superpixels by modeling mixture of Weibulls on Earth Mover’s Distance statistics, then taking the normalized number of proto-objects following partitioning as our estimate of clutter perception. We validated this model using a new 90-image dataset of real world scenes rank ordered by human raters for clutter, and showed that our method not only predicted clutter extremely well (Spearman’s ρ = 0.8038, p < 0.001), but also outperformed all existing clutter perception models and even a behavioral object segmentation ground truth. We conclude that the number of proto-objects in an image affects clutter perception more than the number of objects or features.
6 0.58143777 37 nips-2013-Approximate Bayesian Image Interpretation using Generative Probabilistic Graphics Programs
7 0.56166494 128 nips-2013-Generalized Method-of-Moments for Rank Aggregation
8 0.55382216 329 nips-2013-Third-Order Edge Statistics: Contour Continuation, Curvature, and Cortical Connections
9 0.54915553 212 nips-2013-Non-Uniform Camera Shake Removal Using a Spatially-Adaptive Sparse Penalty
10 0.54469305 260 nips-2013-RNADE: The real-valued neural autoregressive density-estimator
11 0.51043814 153 nips-2013-Learning Feature Selection Dependencies in Multi-task Learning
12 0.49101037 256 nips-2013-Probabilistic Principal Geodesic Analysis
13 0.47242278 187 nips-2013-Memoized Online Variational Inference for Dirichlet Process Mixture Models
14 0.46699238 5 nips-2013-A Deep Architecture for Matching Short Texts
15 0.45331883 229 nips-2013-Online Learning of Nonparametric Mixture Models via Sequential Variational Approximation
16 0.44584173 83 nips-2013-Deep Fisher Networks for Large-Scale Image Classification
17 0.41404933 350 nips-2013-Wavelets on Graphs via Deep Learning
18 0.41303343 27 nips-2013-Adaptive Multi-Column Deep Neural Networks with Application to Robust Image Denoising
19 0.40635344 204 nips-2013-Multiscale Dictionary Learning for Estimating Conditional Distributions
20 0.398527 53 nips-2013-Bayesian inference for low rank spatiotemporal neural receptive fields
topicId topicWeight
[(2, 0.02), (16, 0.021), (33, 0.159), (34, 0.167), (41, 0.028), (49, 0.022), (56, 0.091), (70, 0.016), (72, 0.268), (85, 0.036), (89, 0.034), (93, 0.042), (95, 0.011)]
simIndex simValue paperId paperTitle
1 0.87702519 126 nips-2013-Gaussian Process Conditional Copulas with Applications to Financial Time Series
Author: José Miguel Hernández-Lobato, James R. Lloyd, Daniel Hernández-Lobato
Abstract: The estimation of dependencies between multiple variables is a central problem in the analysis of financial time series. A common approach is to express these dependencies in terms of a copula function. Typically the copula function is assumed to be constant but this may be inaccurate when there are covariates that could have a large influence on the dependence structure of the data. To account for this, a Bayesian framework for the estimation of conditional copulas is proposed. In this framework the parameters of a copula are non-linearly related to some arbitrary conditioning variables. We evaluate the ability of our method to predict time-varying dependencies on several equities and currencies and observe consistent performance gains compared to static copula models and other time-varying copula methods.
2 0.84559333 263 nips-2013-Reasoning With Neural Tensor Networks for Knowledge Base Completion
Author: Richard Socher, Danqi Chen, Christopher D. Manning, Andrew Ng
Abstract: Knowledge bases are an important resource for question answering and other tasks but often suffer from incompleteness and lack of ability to reason over their discrete entities and relationships. In this paper we introduce an expressive neural tensor network suitable for reasoning over relationships between two entities. Previous work represented entities as either discrete atomic units or with a single entity vector representation. We show that performance can be improved when entities are represented as an average of their constituting word vectors. This allows sharing of statistical strength between, for instance, facts involving the “Sumatran tiger” and “Bengal tiger.” Lastly, we demonstrate that all models improve when these word vectors are initialized with vectors learned from unsupervised large corpora. We assess the model by considering the problem of predicting additional true relations between entities given a subset of the knowledge base. Our model outperforms previous models and can classify unseen relationships in WordNet and FreeBase with an accuracy of 86.2% and 90.0%, respectively.
3 0.80556571 244 nips-2013-Parametric Task Learning
Author: Ichiro Takeuchi, Tatsuya Hongo, Masashi Sugiyama, Shinichi Nakajima
Abstract: We introduce an extended formulation of multi-task learning (MTL) called parametric task learning (PTL) that can systematically handle infinitely many tasks parameterized by a continuous parameter. Our key finding is that, for a certain class of PTL problems, the path of the optimal task-wise solutions can be represented as piecewise-linear functions of the continuous task parameter. Based on this fact, we employ a parametric programming technique to obtain the common shared representation across all the continuously parameterized tasks. We show that our PTL formulation is useful in various scenarios such as learning under non-stationarity, cost-sensitive learning, and quantile regression. We demonstrate the advantage of our approach in these scenarios.
same-paper 4 0.80324757 167 nips-2013-Learning the Local Statistics of Optical Flow
Author: Dan Rosenbaum, Daniel Zoran, Yair Weiss
Abstract: Motivated by recent progress in natural image statistics, we use newly available datasets with ground truth optical flow to learn the local statistics of optical flow and compare the learned models to prior models assumed by computer vision researchers. We find that a Gaussian mixture model (GMM) with 64 components provides a significantly better model for local flow statistics when compared to commonly used models. We investigate the source of the GMM’s success and show it is related to an explicit representation of flow boundaries. We also learn a model that jointly models the local intensity pattern and the local optical flow. In accordance with the assumptions often made in computer vision, the model learns that flow boundaries are more likely at intensity boundaries. However, when evaluated on a large dataset, this dependency is very weak and the benefit of conditioning flow estimation on the local intensity pattern is marginal.
5 0.76099187 262 nips-2013-Real-Time Inference for a Gamma Process Model of Neural Spiking
Author: David Carlson, Vinayak Rao, Joshua T. Vogelstein, Lawrence Carin
Abstract: With simultaneous measurements from ever increasing populations of neurons, there is a growing need for sophisticated tools to recover signals from individual neurons. In electrophysiology experiments, this classically proceeds in a two-step process: (i) threshold the waveforms to detect putative spikes and (ii) cluster the waveforms into single units (neurons). We extend previous Bayesian nonparametric models of neural spiking to jointly detect and cluster neurons using a Gamma process model. Importantly, we develop an online approximate inference scheme enabling real-time analysis, with performance exceeding the previous state-of-the-art. Via exploratory data analysis, using data with partial ground truth as well as two novel data sets, we find several features of our model collectively contribute to our improved performance including: (i) accounting for colored noise, (ii) detecting overlapping spikes, (iii) tracking waveform dynamics, and (iv) using multiple channels. We hope to enable novel experiments simultaneously measuring many thousands of neurons and possibly adapting stimuli dynamically to probe ever deeper into the mysteries of the brain.
6 0.72795743 336 nips-2013-Translating Embeddings for Modeling Multi-relational Data
7 0.68675131 173 nips-2013-Least Informative Dimensions
8 0.6810813 346 nips-2013-Variational Inference for Mahalanobis Distance Metrics in Gaussian Process Regression
9 0.67992485 286 nips-2013-Robust learning of low-dimensional dynamics from large neural ensembles
10 0.6786359 239 nips-2013-Optimistic policy iteration and natural actor-critic: A unifying view and a non-optimality result
11 0.6784997 201 nips-2013-Multi-Task Bayesian Optimization
12 0.6768837 294 nips-2013-Similarity Component Analysis
13 0.67508745 115 nips-2013-Factorized Asymptotic Bayesian Inference for Latent Feature Models
14 0.67492962 152 nips-2013-Learning Efficient Random Maximum A-Posteriori Predictors with Non-Decomposable Loss Functions
15 0.67408228 143 nips-2013-Integrated Non-Factorized Variational Inference
16 0.67387754 53 nips-2013-Bayesian inference for low rank spatiotemporal neural receptive fields
17 0.67287904 229 nips-2013-Online Learning of Nonparametric Mixture Models via Sequential Variational Approximation
18 0.6726706 312 nips-2013-Stochastic Gradient Riemannian Langevin Dynamics on the Probability Simplex
19 0.67216867 348 nips-2013-Variational Policy Search via Trajectory Optimization
20 0.67161065 278 nips-2013-Reward Mapping for Transfer in Long-Lived Agents