nips nips2013 nips2013-260 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Benigno Uria, Iain Murray, Hugo Larochelle
Abstract: We introduce RNADE, a new model for joint density estimation of real-valued vectors. Our model calculates the density of a datapoint as the product of one-dimensional conditionals modeled using mixture density networks with shared parameters. RNADE learns a distributed representation of the data, while having a tractable expression for the calculation of densities. A tractable likelihood allows direct comparison with other methods and training by standard gradient-based optimizers. We compare the performance of RNADE on several datasets of heterogeneous and perceptual data, finding it outperforms mixture models in all but one case. 1
Reference: text
sentIndex sentText sentNum sentScore
1 RNADE: The real-valued neural autoregressive density-estimator Benigno Uria and Iain Murray School of Informatics University of Edinburgh {b. [sent-1, score-0.333]
2 uk Hugo Larochelle Département d’informatique Université de Sherbrooke hugo. [sent-5, score-0.069]
3 ca Abstract We introduce RNADE, a new model for joint density estimation of real-valued vectors. [sent-7, score-0.14]
4 Our model calculates the density of a datapoint as the product of one-dimensional conditionals modeled using mixture density networks with shared parameters. [sent-8, score-0.907]
5 RNADE learns a distributed representation of the data, while having a tractable expression for the calculation of densities. [sent-9, score-0.161]
6 A tractable likelihood allows direct comparison with other methods and training by standard gradient-based optimizers. [sent-10, score-0.155]
7 We compare the performance of RNADE on several datasets of heterogeneous and perceptual data, finding it outperforms mixture models in all but one case. [sent-11, score-0.377]
8 1 Introduction Probabilistic approaches to machine learning involve modeling the probability distributions over large collections of variables. [sent-12, score-0.181]
9 The number of parameters required to describe a general discrete distribution grows exponentially in its dimensionality, so some structure or regularity must be imposed, often through graphical models [e. [sent-13, score-0.194]
10 Graphical models are also used to describe probability densities over collections of real-valued variables. [sent-16, score-0.212]
11 Often parts of a task-specific probabilistic model are hard to specify, and are learned from data using generic models. [sent-17, score-0.096]
12 For example, the natural probabilistic approach to image restoration tasks (such as denoising, deblurring, inpainting) requires a multivariate distribution over uncorrupted patches of pixels. [sent-18, score-0.326]
13 It has long been appreciated that large classes of densities can be estimated consistently by kernel density estimation [2], and a large mixture of Gaussians can closely represent any density. [sent-19, score-0.566]
14 In practice, a parametric mixture of Gaussians seems to fit the distribution over patches of pixels and obtains state-of-the-art restorations [3]. [sent-20, score-0.46]
15 It may not be possible to fit small image patches significantly better, but alternative models could further test this claim. [sent-21, score-0.205]
16 Moreover, competitive alternatives to mixture models might improve performance in other applications that have insufficient training data to fit mixture models well. [sent-22, score-0.647]
17 Restricted Boltzmann Machines (RBMs), which are undirected graphical models, fit samples of binary vectors from a range of sources better than mixture models [4, 5]. [sent-23, score-0.433]
18 One explanation is that RBMs form a distributed representation: many hidden units are active when explaining an observation, which is a better match to most real data than a single mixture component. [sent-24, score-0.501]
19 Another explanation is that RBMs are mixture models, but the number of components is exponential in the number of hidden units. [sent-25, score-0.403]
20 Parameter tying among components allows these more flexible models to generalize better from small numbers of examples. [sent-26, score-0.188]
21 There are two practical difficulties with RBMs: the likelihood of the model must be approximated, and samples can only be drawn from the model approximately by Gibbs sampling. [sent-27, score-0.028]
22 The Neural Autoregressive Distribution Estimator (NADE) overcomes these difficulties [5]. [sent-28, score-0.06]
23 NADE is a directed graphical model, or feed-forward neural network, initially derived as an approximation to an RBM, but then fitted as a model in its own right. [sent-29, score-0.176]
24 An autoregressive model expresses the density of a vector as an ordered product of one-dimensional distributions, each conditioned on the values of previous dimensions in the (perhaps arbitrary) ordering. [sent-31, score-0.599]
25 We use the parameter sharing previously introduced by NADE, combined with mixture density networks [6], an existing flexible approach to modeling real-valued distributions with neural networks. [sent-32, score-0.551]
26 By construction, the density of a test point under RNADE is cheap to compute, unlike RBM-based models. [sent-33, score-0.185]
27 The neural network structure provides a flexible way to alter the mean and variance of a mixture component depending on context, potentially modeling non-linear or heteroscedastic data with fewer components than unconstrained mixture models. [sent-34, score-0.815]
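To make the model description above concrete: RNADE factorizes the joint density as p(x) = ∏_{d=1}^{D} p(x_d | x_{<d}) and models each one-dimensional conditional as a K-component Gaussian mixture whose weights, means and scales are produced by a neural network that reuses NADE-style shared input weights across all D conditionals. The NumPy sketch below is ours, not the authors' code: the parameter names and shapes are illustrative assumptions, and the paper's exact activation rescaling and output parameterization may differ.

import numpy as np

def rnade_log_density(x, W, c, V_pi, b_pi, V_mu, b_mu, V_s, b_s):
    """Log-density log p(x) = sum_d log p(x_d | x_<d) under an RNADE-style model.

    Illustrative shapes: x (D,); W (H, D) shared input weights; c (H,) hidden bias;
    V_pi, V_mu, V_s (D, H, K) and b_pi, b_mu, b_s (D, K) produce the mixture
    weights, means and log-scales of the K-component Gaussian conditional for
    each dimension d. The paper's activation rescaling is omitted for brevity.
    """
    D = x.shape[0]
    a = c.copy()                                   # running pre-activation, shared across conditionals
    log_p = 0.0
    for d in range(D):
        h = 1.0 / (1.0 + np.exp(-a))               # hidden units encoding x_<d
        log_pi = V_pi[d].T @ h + b_pi[d]           # (K,) unnormalised mixture logits
        log_pi = log_pi - np.logaddexp.reduce(log_pi)      # log-softmax
        mu = V_mu[d].T @ h + b_mu[d]               # (K,) component means
        sigma = np.exp(V_s[d].T @ h + b_s[d])      # (K,) component standard deviations
        log_gauss = (-0.5 * ((x[d] - mu) / sigma) ** 2
                     - np.log(sigma) - 0.5 * np.log(2.0 * np.pi))
        log_p += np.logaddexp.reduce(log_pi + log_gauss)   # log sum_k pi_k N(x_d; mu_k, sigma_k^2)
        a = a + x[d] * W[:, d]                     # O(H) update: reuse computation for the next conditional
    return log_p

# Hypothetical usage with random parameters
D, H, K = 5, 16, 3
rng = np.random.default_rng(0)
params = dict(
    W=rng.normal(size=(H, D)), c=np.zeros(H),
    V_pi=rng.normal(size=(D, H, K)), b_pi=np.zeros((D, K)),
    V_mu=rng.normal(size=(D, H, K)), b_mu=np.zeros((D, K)),
    V_s=rng.normal(size=(D, H, K)) * 0.01, b_s=np.zeros((D, K)),
)
print(rnade_log_density(rng.normal(size=D), **params))

Because the pre-activation a is updated incrementally after each dimension, computing all D conditionals costs O(DHK) overall rather than D separate forward passes, which is what makes the exact density, and hence gradient-based maximum-likelihood training, tractable.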
wordName wordTfidf (topN-words)
[('rnade', 0.63), ('nade', 0.36), ('autoregressive', 0.301), ('mixture', 0.244), ('rbms', 0.225), ('density', 0.14), ('patches', 0.124), ('exible', 0.092), ('collections', 0.088), ('appreciated', 0.079), ('gradientbased', 0.079), ('densities', 0.076), ('graphical', 0.075), ('onedimensional', 0.073), ('deblurring', 0.073), ('explanation', 0.073), ('inpainting', 0.069), ('partement', 0.069), ('tying', 0.069), ('culties', 0.069), ('gaussians', 0.067), ('uncorrupted', 0.065), ('heteroscedastic', 0.065), ('datapoint', 0.065), ('iain', 0.065), ('hugo', 0.063), ('informatique', 0.06), ('conditionals', 0.06), ('overcomes', 0.06), ('restoration', 0.06), ('edinburgh', 0.058), ('calculates', 0.055), ('explaining', 0.055), ('informatics', 0.053), ('alter', 0.052), ('factorizes', 0.052), ('rbm', 0.052), ('expresses', 0.051), ('murray', 0.051), ('product', 0.049), ('larochelle', 0.049), ('models', 0.048), ('tractable', 0.048), ('components', 0.046), ('regularity', 0.046), ('imposed', 0.046), ('xd', 0.045), ('cheap', 0.045), ('insuf', 0.044), ('probabilistic', 0.044), ('universit', 0.043), ('perceptual', 0.043), ('tted', 0.043), ('heterogeneous', 0.042), ('estimator', 0.041), ('hidden', 0.04), ('rule', 0.039), ('denoising', 0.039), ('modeling', 0.039), ('unconstrained', 0.037), ('initially', 0.037), ('boltzmann', 0.036), ('alternatives', 0.036), ('sharing', 0.036), ('school', 0.035), ('sources', 0.035), ('units', 0.034), ('obtains', 0.033), ('image', 0.033), ('directed', 0.032), ('ordered', 0.032), ('neural', 0.032), ('pixels', 0.031), ('undirected', 0.031), ('networks', 0.031), ('calculation', 0.03), ('distributions', 0.029), ('learns', 0.029), ('distributed', 0.029), ('network', 0.029), ('likelihood', 0.028), ('machines', 0.028), ('parametric', 0.028), ('specify', 0.027), ('competitive', 0.027), ('consistently', 0.027), ('gibbs', 0.027), ('parts', 0.027), ('fewer', 0.027), ('perhaps', 0.027), ('match', 0.026), ('shared', 0.026), ('chain', 0.026), ('conditioned', 0.026), ('exponentially', 0.025), ('generic', 0.025), ('involve', 0.025), ('representation', 0.025), ('generalize', 0.025), ('modeled', 0.024)]
simIndex simValue paperId paperTitle
same-paper 1 1.0 260 nips-2013-RNADE: The real-valued neural autoregressive density-estimator
Author: Benigno Uria, Iain Murray, Hugo Larochelle
Abstract: We introduce RNADE, a new model for joint density estimation of real-valued vectors. Our model calculates the density of a datapoint as the product of one-dimensional conditionals modeled using mixture density networks with shared parameters. RNADE learns a distributed representation of the data, while having a tractable expression for the calculation of densities. A tractable likelihood allows direct comparison with other methods and training by standard gradient-based optimizers. We compare the performance of RNADE on several datasets of heterogeneous and perceptual data, finding it outperforms mixture models in all but one case. 1
2 0.10319473 315 nips-2013-Stochastic Ratio Matching of RBMs for Sparse High-Dimensional Inputs
Author: Yann Dauphin, Yoshua Bengio
Abstract: Sparse high-dimensional data vectors are common in many application domains where a very large number of rarely non-zero features can be devised. Unfortunately, this creates a computational bottleneck for unsupervised feature learning algorithms such as those based on auto-encoders and RBMs, because they involve a reconstruction step where the whole input vector is predicted from the current feature values. An algorithm was recently developed to successfully handle the case of auto-encoders, based on an importance sampling scheme stochastically selecting which input elements to actually reconstruct during training for each particular example. To generalize this idea to RBMs, we propose a stochastic ratio-matching algorithm that inherits all the computational advantages and unbiasedness of the importance sampling scheme. We show that stochastic ratio matching is a good estimator, allowing the approach to beat the state-of-the-art on two bag-of-word text classification benchmarks (20 Newsgroups and RCV1), while keeping computational cost linear in the number of non-zeros. 1
3 0.093624093 331 nips-2013-Top-Down Regularization of Deep Belief Networks
Author: Hanlin Goh, Nicolas Thome, Matthieu Cord, Joo-Hwee Lim
Abstract: Designing a principled and effective algorithm for learning deep architectures is a challenging problem. The current approach involves two training phases: a fully unsupervised learning followed by a strongly discriminative optimization. We suggest a deep learning strategy that bridges the gap between the two phases, resulting in a three-phase learning procedure. We propose to implement the scheme using a method to regularize deep belief networks with top-down information. The network is constructed from building blocks of restricted Boltzmann machines learned by combining bottom-up and top-down sampled signals. A global optimization procedure that merges samples from a forward bottom-up pass and a top-down pass is used. Experiments on the MNIST dataset show improvements over the existing algorithms for deep belief networks. Object recognition results on the Caltech-101 dataset also yield competitive results. 1
4 0.07458546 36 nips-2013-Annealing between distributions by averaging moments
Author: Roger B. Grosse, Chris J. Maddison, Ruslan Salakhutdinov
Abstract: Many powerful Monte Carlo techniques for estimating partition functions, such as annealed importance sampling (AIS), are based on sampling from a sequence of intermediate distributions which interpolate between a tractable initial distribution and the intractable target distribution. The near-universal practice is to use geometric averages of the initial and target distributions, but alternative paths can perform substantially better. We present a novel sequence of intermediate distributions for exponential families defined by averaging the moments of the initial and target distributions. We analyze the asymptotic performance of both the geometric and moment averages paths and derive an asymptotically optimal piecewise linear schedule. AIS with moment averaging performs well empirically at estimating partition functions of restricted Boltzmann machines (RBMs), which form the building blocks of many deep learning models. 1
5 0.072705559 221 nips-2013-On the Expressive Power of Restricted Boltzmann Machines
Author: James Martens, Arkadev Chattopadhya, Toni Pitassi, Richard Zemel
Abstract: This paper examines the question: What kinds of distributions can be efficiently represented by Restricted Boltzmann Machines (RBMs)? We characterize the RBM’s unnormalized log-likelihood function as a type of neural network, and through a series of simulation results relate these networks to ones whose representational properties are better understood. We show the surprising result that RBMs can efficiently capture any distribution whose density depends on the number of 1’s in their input. We also provide the first known example of a particular type of distribution that provably cannot be efficiently represented by an RBM, assuming a realistic exponential upper bound on the weights. By formally demonstrating that a relatively simple distribution cannot be represented efficiently by an RBM our results provide a new rigorous justification for the use of potentially more expressive generative models, such as deeper ones. 1
6 0.072338425 190 nips-2013-Mid-level Visual Element Discovery as Discriminative Mode Seeking
7 0.069493189 344 nips-2013-Using multiple samples to learn mixture models
8 0.066636443 229 nips-2013-Online Learning of Nonparametric Mixture Models via Sequential Variational Approximation
9 0.063263677 18 nips-2013-A simple example of Dirichlet process mixture inconsistency for the number of components
10 0.063238099 160 nips-2013-Learning Stochastic Feedforward Neural Networks
11 0.062495131 127 nips-2013-Generalized Denoising Auto-Encoders as Generative Models
12 0.061991617 167 nips-2013-Learning the Local Statistics of Optical Flow
13 0.058806598 243 nips-2013-Parallel Sampling of DP Mixture Models using Sub-Cluster Splits
14 0.056463916 192 nips-2013-Minimax Theory for High-dimensional Gaussian Mixtures with Sparse Mean Separation
15 0.055148188 351 nips-2013-What Are the Invariant Occlusive Components of Image Patches? A Probabilistic Generative Approach
16 0.049550787 5 nips-2013-A Deep Architecture for Matching Short Texts
17 0.04881981 212 nips-2013-Non-Uniform Camera Shake Removal Using a Spatially-Adaptive Sparse Penalty
18 0.04774864 251 nips-2013-Predicting Parameters in Deep Learning
19 0.046146628 75 nips-2013-Convex Two-Layer Modeling
20 0.046133988 281 nips-2013-Robust Low Rank Kernel Embeddings of Multivariate Distributions
topicId topicWeight
[(0, 0.105), (1, 0.06), (2, -0.056), (3, -0.028), (4, 0.04), (5, 0.026), (6, 0.052), (7, 0.03), (8, 0.03), (9, -0.058), (10, 0.039), (11, 0.015), (12, -0.045), (13, 0.051), (14, 0.048), (15, 0.065), (16, 0.016), (17, -0.093), (18, -0.047), (19, -0.011), (20, -0.007), (21, 0.062), (22, 0.06), (23, -0.03), (24, 0.0), (25, -0.015), (26, -0.103), (27, 0.006), (28, -0.063), (29, -0.008), (30, 0.055), (31, 0.073), (32, 0.028), (33, -0.031), (34, -0.016), (35, -0.021), (36, 0.065), (37, 0.073), (38, -0.09), (39, 0.112), (40, -0.033), (41, 0.003), (42, -0.076), (43, -0.052), (44, -0.036), (45, -0.08), (46, -0.02), (47, 0.023), (48, 0.124), (49, -0.028)]
simIndex simValue paperId paperTitle
same-paper 1 0.90935254 260 nips-2013-RNADE: The real-valued neural autoregressive density-estimator
Author: Benigno Uria, Iain Murray, Hugo Larochelle
Abstract: We introduce RNADE, a new model for joint density estimation of real-valued vectors. Our model calculates the density of a datapoint as the product of one-dimensional conditionals modeled using mixture density networks with shared parameters. RNADE learns a distributed representation of the data, while having a tractable expression for the calculation of densities. A tractable likelihood allows direct comparison with other methods and training by standard gradient-based optimizers. We compare the performance of RNADE on several datasets of heterogeneous and perceptual data, finding it outperforms mixture models in all but one case. 1
2 0.60911375 36 nips-2013-Annealing between distributions by averaging moments
Author: Roger B. Grosse, Chris J. Maddison, Ruslan Salakhutdinov
Abstract: Many powerful Monte Carlo techniques for estimating partition functions, such as annealed importance sampling (AIS), are based on sampling from a sequence of intermediate distributions which interpolate between a tractable initial distribution and the intractable target distribution. The near-universal practice is to use geometric averages of the initial and target distributions, but alternative paths can perform substantially better. We present a novel sequence of intermediate distributions for exponential families defined by averaging the moments of the initial and target distributions. We analyze the asymptotic performance of both the geometric and moment averages paths and derive an asymptotically optimal piecewise linear schedule. AIS with moment averaging performs well empirically at estimating partition functions of restricted Boltzmann machines (RBMs), which form the building blocks of many deep learning models. 1
3 0.55937159 315 nips-2013-Stochastic Ratio Matching of RBMs for Sparse High-Dimensional Inputs
Author: Yann Dauphin, Yoshua Bengio
Abstract: Sparse high-dimensional data vectors are common in many application domains where a very large number of rarely non-zero features can be devised. Unfortunately, this creates a computational bottleneck for unsupervised feature learning algorithms such as those based on auto-encoders and RBMs, because they involve a reconstruction step where the whole input vector is predicted from the current feature values. An algorithm was recently developed to successfully handle the case of auto-encoders, based on an importance sampling scheme stochastically selecting which input elements to actually reconstruct during training for each particular example. To generalize this idea to RBMs, we propose a stochastic ratio-matching algorithm that inherits all the computational advantages and unbiasedness of the importance sampling scheme. We show that stochastic ratio matching is a good estimator, allowing the approach to beat the state-of-the-art on two bag-of-word text classification benchmarks (20 Newsgroups and RCV1), while keeping computational cost linear in the number of non-zeros. 1
4 0.52242458 221 nips-2013-On the Expressive Power of Restricted Boltzmann Machines
Author: James Martens, Arkadev Chattopadhya, Toni Pitassi, Richard Zemel
Abstract: This paper examines the question: What kinds of distributions can be efficiently represented by Restricted Boltzmann Machines (RBMs)? We characterize the RBM’s unnormalized log-likelihood function as a type of neural network, and through a series of simulation results relate these networks to ones whose representational properties are better understood. We show the surprising result that RBMs can efficiently capture any distribution whose density depends on the number of 1’s in their input. We also provide the first known example of a particular type of distribution that provably cannot be efficiently represented by an RBM, assuming a realistic exponential upper bound on the weights. By formally demonstrating that a relatively simple distribution cannot be represented efficiently by an RBM our results provide a new rigorous justification for the use of potentially more expressive generative models, such as deeper ones. 1
5 0.51835299 167 nips-2013-Learning the Local Statistics of Optical Flow
Author: Dan Rosenbaum, Daniel Zoran, Yair Weiss
Abstract: Motivated by recent progress in natural image statistics, we use newly available datasets with ground truth optical flow to learn the local statistics of optical flow and compare the learned models to prior models assumed by computer vision researchers. We find that a Gaussian mixture model (GMM) with 64 components provides a significantly better model for local flow statistics when compared to commonly used models. We investigate the source of the GMM’s success and show it is related to an explicit representation of flow boundaries. We also learn a model that jointly models the local intensity pattern and the local optical flow. In accordance with the assumptions often made in computer vision, the model learns that flow boundaries are more likely at intensity boundaries. However, when evaluated on a large dataset, this dependency is very weak and the benefit of conditioning flow estimation on the local intensity pattern is marginal. 1
6 0.51834625 18 nips-2013-A simple example of Dirichlet process mixture inconsistency for the number of components
7 0.50097954 190 nips-2013-Mid-level Visual Element Discovery as Discriminative Mode Seeking
8 0.48215881 127 nips-2013-Generalized Denoising Auto-Encoders as Generative Models
9 0.47050029 331 nips-2013-Top-Down Regularization of Deep Belief Networks
10 0.45980051 160 nips-2013-Learning Stochastic Feedforward Neural Networks
11 0.45314223 204 nips-2013-Multiscale Dictionary Learning for Estimating Conditional Distributions
12 0.45191932 229 nips-2013-Online Learning of Nonparametric Mixture Models via Sequential Variational Approximation
13 0.44450906 5 nips-2013-A Deep Architecture for Matching Short Texts
14 0.42384899 344 nips-2013-Using multiple samples to learn mixture models
15 0.4234961 37 nips-2013-Approximate Bayesian Image Interpretation using Generative Probabilistic Graphics Programs
16 0.42107895 351 nips-2013-What Are the Invariant Occlusive Components of Image Patches? A Probabilistic Generative Approach
17 0.41413265 298 nips-2013-Small-Variance Asymptotics for Hidden Markov Models
18 0.403413 80 nips-2013-Data-driven Distributionally Robust Polynomial Optimization
19 0.40190026 200 nips-2013-Multi-Prediction Deep Boltzmann Machines
20 0.38911662 87 nips-2013-Density estimation from unweighted k-nearest neighbor graphs: a roadmap
topicId topicWeight
[(16, 0.034), (33, 0.192), (34, 0.109), (41, 0.028), (49, 0.083), (56, 0.066), (70, 0.019), (75, 0.269), (85, 0.042), (89, 0.024), (93, 0.027)]
simIndex simValue paperId paperTitle
1 0.83023334 21 nips-2013-Action from Still Image Dataset and Inverse Optimal Control to Learn Task Specific Visual Scanpaths
Author: Stefan Mathe, Cristian Sminchisescu
Abstract: Human eye movements provide a rich source of information into the human visual information processing. The complex interplay between the task and the visual stimulus is believed to determine human eye movements, yet it is not fully understood, making it difficult to develop reliable eye movement prediction systems. Our work makes three contributions towards addressing this problem. First, we complement one of the largest and most challenging static computer vision datasets, VOC 2012 Actions, with human eye movement recordings collected under the primary task constraint of action recognition, as well as, separately, for context recognition, in order to analyze the impact of different tasks. Our dataset is unique among the eyetracking datasets of still images in terms of large scale (over 1 million fixations recorded in 9157 images) and different task controls. Second, we propose Markov models to automatically discover areas of interest (AOI) and introduce novel sequential consistency metrics based on them. Our methods can automatically determine the number, the spatial support and the transitions between AOIs, in addition to their locations. Based on such encodings, we quantitatively show that given unconstrained real-world stimuli, task instructions have significant influence on the human visual search patterns and are stable across subjects. Finally, we leverage powerful machine learning techniques and computer vision features in order to learn task-sensitive reward functions from eye movement data within models that allow to effectively predict the human visual search patterns based on inverse optimal control. The methodology achieves state of the art scanpath modeling results. 1
same-paper 2 0.79306406 260 nips-2013-RNADE: The real-valued neural autoregressive density-estimator
Author: Benigno Uria, Iain Murray, Hugo Larochelle
Abstract: We introduce RNADE, a new model for joint density estimation of real-valued vectors. Our model calculates the density of a datapoint as the product of one-dimensional conditionals modeled using mixture density networks with shared parameters. RNADE learns a distributed representation of the data, while having a tractable expression for the calculation of densities. A tractable likelihood allows direct comparison with other methods and training by standard gradient-based optimizers. We compare the performance of RNADE on several datasets of heterogeneous and perceptual data, finding it outperforms mixture models in all but one case. 1
3 0.78786939 302 nips-2013-Sparse Inverse Covariance Estimation with Calibration
Author: Tuo Zhao, Han Liu
Abstract: We propose a semiparametric method for estimating the sparse precision matrix of a high-dimensional elliptical distribution. The proposed method calibrates regularizations when estimating each column of the precision matrix. Thus it is not only asymptotically tuning-free, but also achieves improved finite-sample performance. Theoretically, we prove that the proposed method achieves the parametric rates of convergence in both parameter estimation and model selection. We present numerical results on both simulated and real datasets to support our theory and illustrate the effectiveness of the proposed estimator. 1
4 0.7738499 347 nips-2013-Variational Planning for Graph-based MDPs
Author: Qiang Cheng, Qiang Liu, Feng Chen, Alex Ihler
Abstract: Markov Decision Processes (MDPs) are extremely useful for modeling and solving sequential decision making problems. Graph-based MDPs provide a compact representation for MDPs with large numbers of random variables. However, the complexity of exactly solving a graph-based MDP usually grows exponentially in the number of variables, which limits their application. We present a new variational framework to describe and solve the planning problem of MDPs, and derive both exact and approximate planning algorithms. In particular, by exploiting the graph structure of graph-based MDPs, we propose a factored variational value iteration algorithm in which the value function is first approximated by the multiplication of local-scope value functions, then solved by minimizing a Kullback-Leibler (KL) divergence. The KL divergence is optimized using the belief propagation algorithm, with complexity exponential in only the cluster size of the graph. Experimental comparison on different models shows that our algorithm outperforms existing approximation algorithms at finding good policies. 1
5 0.77131802 166 nips-2013-Learning invariant representations and applications to face verification
Author: Qianli Liao, Joel Z. Leibo, Tomaso Poggio
Abstract: One approach to computer object recognition and modeling the brain’s ventral stream involves unsupervised learning of representations that are invariant to common transformations. However, applications of these ideas have usually been limited to 2D affine transformations, e.g., translation and scaling, since they are easiest to solve via convolution. In accord with a recent theory of transformation-invariance [1], we propose a model that, while capturing other common convolutional networks as special cases, can also be used with arbitrary identity-preserving transformations. The model’s wiring can be learned from videos of transforming objects—or any other grouping of images into sets by their depicted object. Through a series of successively more complex empirical tests, we study the invariance/discriminability properties of this model with respect to different transformations. First, we empirically confirm theoretical predictions (from [1]) for the case of 2D affine transformations. Next, we apply the model to non-affine transformations; as expected, it performs well on face verification tasks requiring invariance to the relatively smooth transformations of 3D rotation-in-depth and changes in illumination direction. Surprisingly, it can also tolerate clutter “transformations” which map an image of a face on one background to an image of the same face on a different background. Motivated by these empirical findings, we tested the same model on face verification benchmark tasks from the computer vision literature: Labeled Faces in the Wild, PubFig [2, 3, 4] and a new dataset we gathered—achieving strong performance in these highly unconstrained cases as well. 1
7 0.6663667 303 nips-2013-Sparse Overlapping Sets Lasso for Multitask Learning and its Application to fMRI Analysis
8 0.66550779 345 nips-2013-Variance Reduction for Stochastic Gradient Optimization
9 0.66366881 262 nips-2013-Real-Time Inference for a Gamma Process Model of Neural Spiking
10 0.66339904 286 nips-2013-Robust learning of low-dimensional dynamics from large neural ensembles
11 0.6600861 341 nips-2013-Universal models for binary spike patterns using centered Dirichlet processes
12 0.66005969 301 nips-2013-Sparse Additive Text Models with Low Rank Background
13 0.65743124 294 nips-2013-Similarity Component Analysis
14 0.65699339 331 nips-2013-Top-Down Regularization of Deep Belief Networks
15 0.65603113 287 nips-2013-Scalable Inference for Logistic-Normal Topic Models
16 0.65455639 200 nips-2013-Multi-Prediction Deep Boltzmann Machines
17 0.65354609 64 nips-2013-Compete to Compute
18 0.65345335 236 nips-2013-Optimal Neural Population Codes for High-dimensional Stimulus Variables
19 0.65245652 173 nips-2013-Least Informative Dimensions
20 0.65176386 183 nips-2013-Mapping paradigm ontologies to and from the brain