nips nips2013 knowledge-graph by maker-knowledge-mining
1 nips-2013-(More) Efficient Reinforcement Learning via Posterior Sampling
Author: Ian Osband, Dan Russo, Benjamin Van Roy
Abstract: Most provably-efficient reinforcement learning algorithms introduce optimism about poorly-understood states and actions to encourage exploration. We study an alternative approach for efficient exploration: posterior sampling for reinforcement learning (PSRL). This algorithm proceeds in repeated episodes of known duration. At the start of each episode, PSRL updates a prior distribution over Markov decision processes and takes one sample from this posterior. PSRL then follows the policy that is optimal for this sample during the episode. The algorithm is conceptually simple, computationally efficient and allows an agent to encode prior knowledge in a natural way. We establish an Õ(τS√(AT)) bound on expected regret, where T is time, τ is the episode length and S and A are the cardinalities of the state and action spaces. This bound is one of the first for an algorithm not based on optimism, and close to the state of the art for any reinforcement learning algorithm. We show through simulation that PSRL significantly outperforms existing algorithms with similar regret bounds.
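Below is a minimal sketch of the episodic PSRL loop the abstract describes, assuming a tabular MDP with a Dirichlet prior over transitions, known mean rewards, and a hypothetical `env` object exposing `reset()`/`step()`; the paper's exact priors and setting may differ.

```python
# A minimal sketch of the PSRL loop for a tabular, episodic MDP.
# The Dirichlet transition prior, known rewards, and finite-horizon value
# iteration are illustrative assumptions, not the paper's exact setup.
import numpy as np

def psrl(env, S, A, tau, n_episodes, rng=np.random.default_rng(0)):
    # `env` is a hypothetical episodic environment: reset() -> state,
    # step(a) -> (next_state, reward).
    dirichlet = np.ones((S, A, S))           # posterior counts over transitions
    reward = np.zeros((S, A))                # assume known mean rewards for simplicity
    for _ in range(n_episodes):
        # Take one sample MDP from the current posterior.
        P = np.stack([[rng.dirichlet(dirichlet[s, a]) for a in range(A)]
                      for s in range(S)])
        # Solve the sampled MDP with finite-horizon value iteration.
        V = np.zeros(S)
        policy = np.zeros((tau, S), dtype=int)
        for h in reversed(range(tau)):
            Q = reward + P @ V               # Q[s, a] = r(s,a) + sum_s' P[s,a,s'] V[s']
            policy[h] = Q.argmax(axis=1)
            V = Q.max(axis=1)
        # Follow the sampled MDP's optimal policy for one episode, then update.
        s = env.reset()
        for h in range(tau):
            a = policy[h, s]
            s_next, _ = env.step(a)
            dirichlet[s, a, s_next] += 1     # conjugate posterior update
            s = s_next
    return dirichlet
```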
2 nips-2013-(Nearly) Optimal Algorithms for Private Online Learning in Full-information and Bandit Settings
Author: Abhradeep Guha Thakurta, Adam Smith
Abstract: We give differentially private algorithms for a large class of online learning algorithms, in both the full information and bandit settings. Our algorithms aim to minimize a convex loss function which is a sum of smaller convex loss terms, one for each data point. To design our algorithms, we modify the popular mirror descent approach, or rather a variant called follow the approximate leader. The technique leads to the first private algorithms for online learning in the bandit setting. In the full information setting, our algorithms improve over the regret bounds of previous work (due to Dwork, Naor, Pitassi and Rothblum (2010) and Jain, Kothari and Thakurta (2012)). In many cases, our algorithms (in both settings) match the dependence on the input length, T, of the optimal nonprivate regret bounds up to logarithmic factors in T. Our algorithms require logarithmic space and update time.
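Since the abstract only names the follow-the-approximate-leader technique, the sketch below shows a much simpler stand-in: noisy follow-the-regularized-leader with fresh Laplace noise on the released gradient sum. It illustrates the general shape of private online learning, not the paper's algorithm or its (tighter, tree-aggregation-based) privacy accounting.

```python
# A simplified sketch of private online convex optimization in the spirit of
# noisy follow-the-regularized-leader; the paper's algorithm (a private variant
# of follow the approximate leader) is more refined than this naive baseline.
import numpy as np

def noisy_ftrl(grads, eta, eps, rng=np.random.default_rng(0)):
    """grads: iterable of per-round gradient vectors, one per data point.
    Assumes gradients have bounded L1 norm so Laplace noise of scale 1/eps is
    calibrated per release (illustrative accounting only)."""
    g_sum, history = None, []
    for t, g in enumerate(grads, start=1):
        g_sum = g.copy() if g_sum is None else g_sum + g
        noisy_sum = g_sum + rng.laplace(scale=1.0 / eps, size=g_sum.shape)
        w = -eta * noisy_sum / np.sqrt(t)    # FTRL step with an L2 regularizer
        history.append(w)
    return history
```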
3 nips-2013-A* Lasso for Learning a Sparse Bayesian Network Structure for Continuous Variables
Author: Jing Xiang, Seyoung Kim
Abstract: We address the problem of learning a sparse Bayesian network structure for continuous variables in a high-dimensional space. The constraint that the estimated Bayesian network structure must be a directed acyclic graph (DAG) makes the problem challenging because of the huge search space of network structures. Most previous methods were based on a two-stage approach that prunes the search space in the first stage and then searches for a network structure satisfying the DAG constraint in the second stage. Although this approach is effective in a low-dimensional setting, it is difficult to ensure that the correct network structure is not pruned in the first stage in a high-dimensional setting. In this paper, we propose a single-stage method, called A* lasso, that recovers the optimal sparse Bayesian network structure by solving a single optimization problem with an A* search algorithm that uses lasso in its scoring system. Our approach substantially improves the computational efficiency of the well-known exact methods based on dynamic programming. We also present a heuristic scheme that further improves the efficiency of A* lasso without significantly compromising the quality of solutions. We demonstrate our approach on data simulated from benchmark Bayesian networks and real data.
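As a small illustration of the scoring system, the hypothetical `lasso_score` below rates one node against a candidate parent set with a Lasso regression (scikit-learn); the A* search over variable orderings that drives the method is omitted.

```python
# A minimal sketch of the Lasso scoring step inside A* lasso: the score of
# node j given a set of allowed parents is the penalized regression error of
# X[:, j] on those columns, and the fit's support gives the selected parents.
import numpy as np
from sklearn.linear_model import Lasso

def lasso_score(X, j, candidate_parents, alpha=0.1):
    """Lower is better. candidate_parents: list of column indices of X."""
    if not candidate_parents:
        return 0.5 * np.mean(X[:, j] ** 2), []
    model = Lasso(alpha=alpha).fit(X[:, candidate_parents], X[:, j])
    resid = X[:, j] - model.predict(X[:, candidate_parents])
    score = 0.5 * np.mean(resid ** 2) + alpha * np.abs(model.coef_).sum()
    parents = [p for p, c in zip(candidate_parents, model.coef_) if c != 0.0]
    return score, parents
```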
4 nips-2013-A Comparative Framework for Preconditioned Lasso Algorithms
Author: Fabian L. Wauthier, Nebojsa Jojic, Michael Jordan
Abstract: The Lasso is a cornerstone of modern multivariate data analysis, yet its performance suffers in the common situation in which covariates are correlated. This limitation has led to a growing number of Preconditioned Lasso algorithms that pre-multiply X and y by matrices P_X, P_y prior to running the standard Lasso. A direct comparison of these and similar Lasso-style algorithms to the original Lasso is difficult because the performance of all of these methods depends critically on an auxiliary penalty parameter λ. In this paper we propose an agnostic framework for comparing Preconditioned Lasso algorithms to the Lasso without having to choose λ. We apply our framework to three Preconditioned Lasso instances and highlight cases when they will outperform the Lasso. Additionally, our theory reveals fragilities of these algorithms to which we provide partial solutions.
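The recipe is easy to state in code. A minimal sketch, assuming scikit-learn and leaving the preconditioners P_X, P_y as caller-supplied placeholders (each algorithm in this family constructs its own, e.g. from an SVD of X):

```python
# A minimal sketch of the Preconditioned Lasso recipe: pre-multiply X and y by
# preconditioners P_X, P_y, then run a standard Lasso. Identity preconditioners
# are placeholders, not any particular algorithm's choice.
import numpy as np
from sklearn.linear_model import Lasso

def preconditioned_lasso(X, y, P_X=None, P_y=None, lam=0.1):
    n = X.shape[0]
    P_X = np.eye(n) if P_X is None else P_X
    P_y = np.eye(n) if P_y is None else P_y
    return Lasso(alpha=lam).fit(P_X @ X, P_y @ y)
```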
5 nips-2013-A Deep Architecture for Matching Short Texts
Author: Zhengdong Lu, Hang Li
Abstract: Many machine learning problems can be interpreted as learning for matching two types of objects (e.g., images and captions, users and products, queries and documents, etc.). The matching level of two objects is usually measured as the inner product in a certain feature space, while the modeling effort focuses on mapping of objects from the original space to the feature space. This schema, although proven successful on a range of matching tasks, is insufficient for capturing the rich structure in the matching process of more complicated objects. In this paper, we propose a new deep architecture to more effectively model the complicated matching relations between two objects from heterogeneous domains. More specifically, we apply this model to matching tasks in natural language, e.g., finding sensible responses for a tweet, or relevant answers to a given question. This new architecture naturally combines the localness and hierarchy intrinsic to the natural language problems, and therefore greatly improves upon the state-of-the-art models.
6 nips-2013-A Determinantal Point Process Latent Variable Model for Inhibition in Neural Spiking Data
Author: Jasper Snoek, Richard Zemel, Ryan P. Adams
Abstract: Point processes are popular models of neural spiking behavior as they provide a statistical distribution over temporal sequences of spikes and help to reveal the complexities underlying a series of recorded action potentials. However, the most common neural point process models, the Poisson process and the gamma renewal process, do not capture interactions and correlations that are critical to modeling populations of neurons. We develop a novel model based on a determinantal point process over latent embeddings of neurons that effectively captures and helps visualize complex inhibitory and competitive interaction. We show that this model is a natural extension of the popular generalized linear model to sets of interacting neurons. The model is extended to incorporate gain control or divisive normalization, and the modulation of neural spiking based on periodic phenomena. Applied to neural spike recordings from the rat hippocampus, we see that the model captures inhibitory relationships, a dichotomy of classes of neurons, and a periodic modulation by the theta rhythm known to be present in the data.
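A minimal sketch of the core likelihood, assuming an L-ensemble DPP with an illustrative Gaussian similarity kernel over hypothetical latent embeddings V, so that neurons with nearby embeddings are unlikely to spike together; the paper's extensions (gain control, periodic modulation) are not reproduced.

```python
# A minimal sketch of a DPP over latent neuron embeddings: the kernel encodes
# similarity, so similar neurons repel (are unlikely to co-spike), modeling
# inhibition. The Gaussian kernel is an illustrative assumption.
import numpy as np

def dpp_log_likelihood(V, spiking_idx, scale=1.0):
    """V: (num_neurons, dim) latent embeddings; spiking_idx: indices of neurons
    that fired in a time bin."""
    d2 = ((V[:, None, :] - V[None, :, :]) ** 2).sum(-1)
    L = np.exp(-d2 / (2 * scale ** 2))              # similarity kernel over neurons
    _, logdet_A = np.linalg.slogdet(L[np.ix_(spiking_idx, spiking_idx)])
    _, logdet_Z = np.linalg.slogdet(L + np.eye(len(V)))
    return logdet_A - logdet_Z                      # log P(A) = log det L_A - log det(L + I)
```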
7 nips-2013-A Gang of Bandits
Author: Nicolò Cesa-Bianchi, Claudio Gentile, Giovanni Zappella
Abstract: Multi-armed bandit problems formalize the exploration-exploitation trade-offs arising in several industrially relevant applications, such as online advertisement and, more generally, recommendation systems. In many cases, however, these applications have a strong social component, whose integration in the bandit algorithm could lead to a dramatic performance increase. For instance, content may be served to a group of users by taking advantage of an underlying network of social relationships among them. In this paper, we introduce novel algorithmic approaches to the solution of such networked bandit problems. More specifically, we design and analyze a global recommendation strategy which allocates a bandit algorithm to each network node (user) and allows it to “share” signals (contexts and payoffs) with the neighboring nodes. We then derive two more scalable variants of this strategy based on different ways of clustering the graph nodes. We experimentally compare the algorithm and its variants to state-of-the-art methods for contextual bandits that do not use the relational information. Our experiments, carried out on synthetic and real-world datasets, show a consistent increase in prediction performance obtained by exploiting the network structure.
8 nips-2013-A Graphical Transformation for Belief Propagation: Maximum Weight Matchings and Odd-Sized Cycles
Author: Jinwoo Shin, Andrew E. Gelfand, Misha Chertkov
Abstract: Max-product ‘belief propagation’ (BP) is a popular distributed heuristic for finding the Maximum A Posteriori (MAP) assignment in a joint probability distribution represented by a Graphical Model (GM). It was recently shown that BP converges to the correct MAP assignment for a class of loopy GMs with the following common feature: the Linear Programming (LP) relaxation to the MAP problem is tight (has no integrality gap). Unfortunately, tightness of the LP relaxation does not, in general, guarantee convergence and correctness of the BP algorithm. The failure of BP in such cases motivates reverse engineering a solution – namely, given a tight LP, can we design a ‘good’ BP algorithm? In this paper, we design a BP algorithm for the Maximum Weight Matching (MWM) problem over general graphs. We prove that the algorithm converges to the correct optimum if the respective LP relaxation, which may include inequalities associated with non-intersecting odd-sized cycles, is tight. The most significant part of our approach is the introduction of a novel graph transformation designed to force convergence of BP. Our theoretical result suggests an efficient BP-based heuristic for the MWM problem, which consists of making sequential, “cutting plane”, modifications to the underlying GM. Our experiments show that this heuristic performs as well as traditional cutting-plane algorithms using LP solvers on MWM problems.
9 nips-2013-A Kernel Test for Three-Variable Interactions
Author: Dino Sejdinovic, Arthur Gretton, Wicher Bergsma
Abstract: We introduce kernel nonparametric tests for Lancaster three-variable interaction and for total independence, using embeddings of signed measures into a reproducing kernel Hilbert space. The resulting test statistics are straightforward to compute, and are used in powerful interaction tests, which are consistent against all alternatives for a large family of reproducing kernels. We show the Lancaster test to be sensitive to cases where two independent causes individually have weak influence on a third dependent variable, but their combined effect has a strong influence. This makes the Lancaster test especially suited to finding structure in directed graphical models, where it outperforms competing nonparametric tests in detecting such V-structures.
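The statistic itself is simple to compute. A sketch, assuming Gaussian kernels and omitting the permutation test that would calibrate the null distribution:

```python
# A minimal sketch of the empirical Lancaster three-variable interaction
# statistic: the mean of the entrywise product of the three centred Gram
# matrices. Gaussian kernels are an illustrative choice.
import numpy as np

def gram(X, sigma=1.0):
    """X: (n, d) array; returns the (n, n) Gaussian Gram matrix."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def lancaster_statistic(X, Y, Z):
    n = X.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n             # centring matrix
    Kc, Lc, Mc = (H @ gram(W) @ H for W in (X, Y, Z))
    return (Kc * Lc * Mc).mean()                    # entrywise product, then mean
```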
10 nips-2013-A Latent Source Model for Nonparametric Time Series Classification
Author: George H. Chen, Stanislav Nikolov, Devavrat Shah
Abstract: For classifying time series, a nearest-neighbor approach is widely used in practice with performance often competitive with or better than more elaborate methods such as neural networks, decision trees, and support vector machines. We develop theoretical justification for the effectiveness of nearest-neighbor-like classification of time series. Our guiding hypothesis is that in many applications, such as forecasting which topics will become trends on Twitter, there aren’t actually that many prototypical time series to begin with, relative to the number of time series we have access to, e.g., topics become trends on Twitter only in a few distinct manners whereas we can collect massive amounts of Twitter data. To operationalize this hypothesis, we propose a latent source model for time series, which naturally leads to a “weighted majority voting” classification rule that can be approximated by a nearest-neighbor classifier. We establish nonasymptotic performance guarantees of both weighted majority voting and nearest-neighbor classification under our model accounting for how much of the time series we observe and the model complexity. Experimental results on synthetic data show weighted majority voting achieving the same misclassification rate as nearest-neighbor classification while observing less of the time series. We then use weighted majority voting to forecast which news topics on Twitter become trends, where we are able to detect such “trending topics” in advance of Twitter 79% of the time, with a mean early advantage of 1 hour and 26 minutes, a true positive rate of 95%, and a false positive rate of 4%.
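A minimal sketch of such a weighted-majority-voting rule, assuming squared Euclidean distance on the observed prefix and ignoring the time shifts the model also allows; letting gamma grow large recovers nearest-neighbor classification.

```python
# A minimal sketch of weighted majority voting for time series: every labeled
# series votes for its label with weight exp(-gamma * d), where d is the
# squared distance over the observed prefix.
import numpy as np

def weighted_majority_vote(train_X, train_y, x_obs, gamma=1.0):
    """train_X: (n, T) labeled series; x_obs: observed prefix of length t <= T."""
    t = len(x_obs)
    d = ((train_X[:, :t] - x_obs) ** 2).sum(axis=1)
    weights = np.exp(-gamma * d)                    # gamma -> inf gives 1-NN
    labels = np.unique(train_y)
    scores = np.array([weights[train_y == c].sum() for c in labels])
    return labels[scores.argmax()]
```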
11 nips-2013-A New Convex Relaxation for Tensor Completion
Author: Bernardino Romera-Paredes, Massimiliano Pontil
Abstract: We study the problem of learning a tensor from a set of linear measurements. A prominent methodology for this problem is based on a generalization of trace norm regularization, which has been used extensively for learning low rank matrices, to the tensor setting. In this paper, we highlight some limitations of this approach and propose an alternative convex relaxation on the Euclidean ball. We then describe a technique to solve the associated regularization problem, which builds upon the alternating direction method of multipliers. Experiments on one synthetic dataset and two real datasets indicate that the proposed method improves significantly over tensor trace norm regularization in terms of estimation error, while remaining computationally tractable.
12 nips-2013-A Novel Two-Step Method for Cross Language Representation Learning
Author: Min Xiao, Yuhong Guo
Abstract: Cross language text classification is an important learning task in natural language processing. A critical challenge of cross language learning arises from the fact that words of different languages are in disjoint feature spaces. In this paper, we propose a two-step representation learning method to bridge the feature spaces of different languages by exploiting a set of parallel bilingual documents. Specifically, we first formulate a matrix completion problem to produce a complete parallel document-term matrix for all documents in two languages, and then induce a low dimensional cross-lingual document representation by applying latent semantic indexing on the obtained matrix. We use a projected gradient descent algorithm to solve the formulated matrix completion problem with convergence guarantees. The proposed method is evaluated by conducting a set of experiments with cross language sentiment classification tasks on Amazon product reviews. The experimental results demonstrate that the proposed learning method outperforms a number of other cross language representation learning methods, especially when the number of parallel bilingual documents is small.
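A sketch of the two-step pipeline, with an SVD soft-thresholding (SoftImpute-style) stand-in for the paper's projected-gradient matrix-completion solver, followed by latent semantic indexing via a truncated SVD:

```python
# A minimal sketch of the two-step method: (1) complete the parallel
# document-term matrix (soft-thresholded SVD here is a stand-in for the
# paper's projected-gradient solver), then (2) apply LSI to the result.
import numpy as np

def complete_and_embed(M, mask, k=50, tau=1.0, n_iters=100):
    """M: docs x (terms in both languages); mask: 1 where an entry is observed."""
    Z = np.where(mask, M, 0.0)
    for _ in range(n_iters):                       # SVD soft-thresholding loop
        U, s, Vt = np.linalg.svd(Z, full_matrices=False)
        Z = (U * np.maximum(s - tau, 0.0)) @ Vt
        Z = np.where(mask, M, Z)                   # keep observed entries fixed
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return U[:, :k] * s[:k]                        # k-dim LSI document embedding
```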
13 nips-2013-A Scalable Approach to Probabilistic Latent Space Inference of Large-Scale Networks
Author: Junming Yin, Qirong Ho, Eric Xing
Abstract: We propose a scalable approach for making inference about latent spaces of large networks. With a succinct representation of networks as a bag of triangular motifs, a parsimonious statistical model, and an efficient stochastic variational inference algorithm, we are able to analyze real networks with over a million vertices and hundreds of latent roles on a single machine in a matter of hours, a setting that is out of reach for many existing methods. When compared to the state-of-the-art probabilistic approaches, our method is several orders of magnitude faster, with competitive or improved accuracy for latent space recovery and link prediction.
14 nips-2013-A Stability-based Validation Procedure for Differentially Private Machine Learning
Author: Kamalika Chaudhuri, Staal A. Vinterbo
Abstract: Differential privacy is a cryptographically motivated definition of privacy which has gained considerable attention in the algorithms, machine-learning and data-mining communities. While there has been an explosion of work on differentially private machine learning algorithms, a major barrier to achieving end-to-end differential privacy in practical machine learning applications is the lack of an effective procedure for differentially private parameter tuning, or, determining the parameter value, such as a bin size in a histogram, or a regularization parameter, that is suitable for a particular application. In this paper, we introduce a generic validation procedure for differentially private machine learning algorithms that applies when a certain stability condition holds on the training algorithm and the validation performance metric. The training data size and the privacy budget used for training in our procedure are independent of the number of parameter values searched over. We apply our generic procedure to two fundamental tasks in statistics and machine learning – training a regularized linear classifier and building a histogram density estimator – that result in end-to-end differentially private solutions for these problems.
15 nips-2013-A memory frontier for complex synapses
Author: Subhaneil Lahiri, Surya Ganguli
Abstract: An incredible gulf separates theoretical models of synapses, often described solely by a single scalar value denoting the size of a postsynaptic potential, from the immense complexity of molecular signaling pathways underlying real synapses. To understand the functional contribution of such molecular complexity to learning and memory, it is essential to expand our theoretical conception of a synapse from a single scalar to an entire dynamical system with many internal molecular functional states. Moreover, theoretical considerations alone demand such an expansion; network models with scalar synapses assuming finite numbers of distinguishable synaptic strengths have strikingly limited memory capacity. This raises the fundamental question, how does synaptic complexity give rise to memory? To address this, we develop new mathematical theorems elucidating the relationship between the structural organization and memory properties of complex synapses that are themselves molecular networks. Moreover, in proving such theorems, we uncover a framework, based on first passage time theory, to impose an order on the internal states of complex synaptic models, thereby simplifying the relationship between synaptic structure and function.
16 nips-2013-A message-passing algorithm for multi-agent trajectory planning
Author: Jose Bento, Nate Derbinsky, Javier Alonso-Mora, Jonathan S. Yedidia
Abstract: We describe a novel approach for computing collision-free global trajectories for p agents with specified initial and final configurations, based on an improved version of the alternating direction method of multipliers (ADMM). Compared with existing methods, our approach is naturally parallelizable and allows for incorporating different cost functionals with only minor adjustments. We apply our method to classical challenging instances and observe that its computational requirements scale well with p for several cost functionals. We also show that a specialization of our algorithm can be used for local motion planning by solving the problem of joint optimization in velocity space.
17 nips-2013-A multi-agent control framework for co-adaptation in brain-computer interfaces
Author: Josh S. Merel, Roy Fox, Tony Jebara, Liam Paninski
Abstract: In a closed-loop brain-computer interface (BCI), adaptive decoders are used to learn parameters suited to decoding the user’s neural response. Feedback to the user provides information which permits the neural tuning to also adapt. We present an approach to model this process of co-adaptation between the encoding model of the neural signal and the decoding algorithm as a multi-agent formulation of the linear quadratic Gaussian (LQG) control problem. In simulation we characterize how decoding performance improves as the neural encoding and adaptive decoder optimize, qualitatively resembling experimentally demonstrated closed-loop improvement. We then propose a novel, modified decoder update rule which is aware of the fact that the encoder is also changing and show it can improve simulated co-adaptation dynamics. Our modeling approach offers promise for gaining insights into co-adaptation as well as improving user learning of BCI control in practical settings.
18 nips-2013-A simple example of Dirichlet process mixture inconsistency for the number of components
Author: Jeffrey W. Miller, Matthew T. Harrison
Abstract: For data assumed to come from a finite mixture with an unknown number of components, it has become common to use Dirichlet process mixtures (DPMs) not only for density estimation, but also for inferences about the number of components. The typical approach is to use the posterior distribution on the number of clusters — that is, the posterior on the number of components represented in the observed data. However, it turns out that this posterior is not consistent — it does not concentrate at the true number of components. In this note, we give an elementary proof of this inconsistency in what is perhaps the simplest possible setting: a DPM with normal components of unit variance, applied to data from a “mixture” with one standard normal component. Further, we show that this example exhibits severe inconsistency: instead of going to 1, the posterior probability that there is one cluster converges (in probability) to 0.
19 nips-2013-Accelerated Mini-Batch Stochastic Dual Coordinate Ascent
Author: Shai Shalev-Shwartz, Tong Zhang
Abstract: Stochastic dual coordinate ascent (SDCA) is an effective technique for solving regularized loss minimization problems in machine learning. This paper considers an extension of SDCA under the mini-batch setting that is often used in practice. Our main contribution is to introduce an accelerated mini-batch version of SDCA and prove a fast convergence rate for this method. We discuss an implementation of our method over a parallel computing system, and compare the results to both the vanilla stochastic dual coordinate ascent and to the accelerated deterministic gradient descent method of Nesterov [2007].
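For reference, a minimal sketch of vanilla (single-coordinate) SDCA for L2-regularized squared loss, the baseline being accelerated here; the closed-form dual update below is the standard one for this loss, while the paper's method layers mini-batching and Nesterov-style acceleration on top.

```python
# A minimal sketch of vanilla SDCA for L2-regularized squared loss. Each step
# maximizes the dual objective in one coordinate alpha_i, which has a closed
# form, and maintains the primal-dual relation w = X^T alpha / (lam * n).
import numpy as np

def sdca_squared_loss(X, y, lam, n_epochs=10, rng=np.random.default_rng(0)):
    n, d = X.shape
    alpha, w = np.zeros(n), np.zeros(d)
    for _ in range(n_epochs * n):
        i = rng.integers(n)
        # Closed-form dual coordinate update for the squared loss.
        delta = (y[i] - X[i] @ w - alpha[i]) / (1.0 + X[i] @ X[i] / (lam * n))
        alpha[i] += delta
        w += delta * X[i] / (lam * n)        # keep w consistent with alpha
    return w
```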
20 nips-2013-Accelerating Stochastic Gradient Descent using Predictive Variance Reduction
Author: Rie Johnson, Tong Zhang
Abstract: Stochastic gradient descent is popular for large scale optimization but has slow convergence asymptotically due to the inherent variance. To remedy this problem, we introduce an explicit variance reduction method for stochastic gradient descent which we call stochastic variance reduced gradient (SVRG). For smooth and strongly convex functions, we prove that this method enjoys the same fast convergence rate as those of stochastic dual coordinate ascent (SDCA) and Stochastic Average Gradient (SAG). However, our analysis is significantly simpler and more intuitive. Moreover, unlike SDCA or SAG, our method does not require the storage of gradients, and thus is more easily applicable to complex problems such as some structured prediction problems and neural network learning.
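The SVRG update is compact enough to sketch directly; a minimal version, assuming a user-supplied per-example gradient oracle `grad_i(w, i)`:

```python
# A minimal sketch of SVRG for an average of n smooth losses: each inner step
# uses the variance-reduced gradient grad_i(w) - grad_i(w_snap) + mu, where mu
# is the full gradient at the snapshot. Note no per-example gradients are
# stored, unlike SDCA or SAG.
import numpy as np

def svrg(grad_i, n, w0, eta, n_outer=20, m=None, rng=np.random.default_rng(0)):
    """grad_i(w, i) returns the gradient of the i-th loss term at w."""
    m = m or 2 * n                            # inner loop length (a common choice)
    w = w0.copy()
    for _ in range(n_outer):
        w_snap = w.copy()
        mu = sum(grad_i(w_snap, i) for i in range(n)) / n   # full gradient at snapshot
        for _ in range(m):
            i = rng.integers(n)
            w -= eta * (grad_i(w, i) - grad_i(w_snap, i) + mu)
    return w
```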
23 nips-2013-Active Learning for Probabilistic Hypotheses Using the Maximum Gibbs Error Criterion
24 nips-2013-Actor-Critic Algorithms for Risk-Sensitive MDPs
25 nips-2013-Adaptive Anonymity via $b$-Matching
26 nips-2013-Adaptive Market Making via Online Learning
27 nips-2013-Adaptive Multi-Column Deep Neural Networks with Application to Robust Image Denoising
28 nips-2013-Adaptive Step-Size for Policy Gradient Methods
29 nips-2013-Adaptive Submodular Maximization in Bandit Setting
30 nips-2013-Adaptive dropout for training deep neural networks
31 nips-2013-Adaptivity to Local Smoothness and Dimension in Kernel Regression
32 nips-2013-Aggregating Optimistic Planning Trees for Solving Markov Decision Processes
33 nips-2013-An Approximate, Efficient LP Solver for LP Rounding
34 nips-2013-Analyzing Hogwild Parallel Gaussian Gibbs Sampling
35 nips-2013-Analyzing the Harmonic Structure in Graph-Based Learning
36 nips-2013-Annealing between distributions by averaging moments
37 nips-2013-Approximate Bayesian Image Interpretation using Generative Probabilistic Graphics Programs
38 nips-2013-Approximate Dynamic Programming Finally Performs Well in the Game of Tetris
40 nips-2013-Approximate Inference in Continuous Determinantal Processes
41 nips-2013-Approximate inference in latent Gaussian-Markov models from continuous time observations
42 nips-2013-Auditing: Active Learning with Outcome-Dependent Query Costs
43 nips-2013-Auxiliary-variable Exact Hamiltonian Monte Carlo Samplers for Binary Distributions
44 nips-2013-B-test: A Non-parametric, Low Variance Kernel Two-sample Test
45 nips-2013-BIG & QUIC: Sparse Inverse Covariance Estimation for a Million Variables
46 nips-2013-Bayesian Estimation of Latently-grouped Parameters in Undirected Graphical Models
47 nips-2013-Bayesian Hierarchical Community Discovery
48 nips-2013-Bayesian Inference and Learning in Gaussian Process State-Space Models with Particle MCMC
49 nips-2013-Bayesian Inference and Online Experimental Design for Mapping Neural Microcircuits
51 nips-2013-Bayesian entropy estimation for binary spike train data using parametric prior knowledge
53 nips-2013-Bayesian inference for low rank spatiotemporal neural receptive fields
54 nips-2013-Bayesian optimization explains human active search
55 nips-2013-Bellman Error Based Feature Generation using Random Projections on Sparse Spaces
56 nips-2013-Better Approximation and Faster Algorithm Using the Proximal Average
57 nips-2013-Beyond Pairwise: Provably Fast Algorithms for Approximate $k$-Way Similarity Search
58 nips-2013-Binary to Bushy: Bayesian Hierarchical Clustering with the Beta Coalescent
59 nips-2013-Blind Calibration in Compressed Sensing using Message Passing Algorithms
60 nips-2013-Buy-in-Bulk Active Learning
61 nips-2013-Capacity of strong attractor patterns to model behavioural and cognitive prototypes
62 nips-2013-Causal Inference on Time Series using Restricted Structural Equation Models
63 nips-2013-Cluster Trees on Manifolds
64 nips-2013-Compete to Compute
65 nips-2013-Compressive Feature Learning
66 nips-2013-Computing the Stationary Distribution Locally
67 nips-2013-Conditional Random Fields via Univariate Exponential Families
68 nips-2013-Confidence Intervals and Hypothesis Testing for High-Dimensional Statistical Models
69 nips-2013-Context-sensitive active sensing in humans
70 nips-2013-Contrastive Learning Using Spectral Methods
71 nips-2013-Convergence of Monte Carlo Tree Search in Simultaneous Move Games
73 nips-2013-Convex Relaxations for Permutation Problems
74 nips-2013-Convex Tensor Decomposition via Structured Schatten Norm Regularization
75 nips-2013-Convex Two-Layer Modeling
76 nips-2013-Correlated random features for fast semi-supervised learning
77 nips-2013-Correlations strike back (again): the case of associative memory retrieval
78 nips-2013-Curvature and Optimal Algorithms for Learning and Minimizing Submodular Functions
79 nips-2013-DESPOT: Online POMDP Planning with Regularization
80 nips-2013-Data-driven Distributionally Robust Polynomial Optimization
81 nips-2013-DeViSE: A Deep Visual-Semantic Embedding Model
82 nips-2013-Decision Jungles: Compact and Rich Models for Classification
83 nips-2013-Deep Fisher Networks for Large-Scale Image Classification
84 nips-2013-Deep Neural Networks for Object Detection
85 nips-2013-Deep content-based music recommendation
86 nips-2013-Demixing odors - fast inference in olfaction
87 nips-2013-Density estimation from unweighted k-nearest neighbor graphs: a roadmap
88 nips-2013-Designed Measurements for Vector Count Data
89 nips-2013-Dimension-Free Exponentiated Gradient
90 nips-2013-Direct 0-1 Loss Minimization and Margin Maximization with Boosting
91 nips-2013-Dirty Statistical Models
92 nips-2013-Discovering Hidden Variables in Noisy-Or Networks using Quartet Tests
93 nips-2013-Discriminative Transfer Learning with Tree-based Priors
94 nips-2013-Distributed $k$-means and $k$-median Clustering on General Topologies
95 nips-2013-Distributed Exploration in Multi-Armed Bandits
96 nips-2013-Distributed Representations of Words and Phrases and their Compositionality
97 nips-2013-Distributed Submodular Maximization: Identifying Representative Elements in Massive Data
98 nips-2013-Documents as multiple overlapping windows into grids of counts
99 nips-2013-Dropout Training as Adaptive Regularization
100 nips-2013-Dynamic Clustering via Asymptotics of the Dependent Dirichlet Process Mixture
101 nips-2013-EDML for Learning Parameters in Directed and Undirected Graphical Models
102 nips-2013-Efficient Algorithm for Privately Releasing Smooth Queries
103 nips-2013-Efficient Exploration and Value Function Generalization in Deterministic Systems
104 nips-2013-Efficient Online Inference for Bayesian Nonparametric Relational Models
105 nips-2013-Efficient Optimization for Sparse Gaussian Process Regression
106 nips-2013-Eluder Dimension and the Sample Complexity of Optimistic Exploration
107 nips-2013-Embed and Project: Discrete Sampling with Universal Hashing
109 nips-2013-Estimating LASSO Risk and Noise Level
110 nips-2013-Estimating the Unseen: Improved Estimators for Entropy and other Properties
111 nips-2013-Estimation, Optimization, and Parallelism when Data is Sparse
112 nips-2013-Estimation Bias in Multi-Armed Bandit Algorithms for Search Advertising
113 nips-2013-Exact and Stable Recovery of Pairwise Interaction Tensors
115 nips-2013-Factorized Asymptotic Bayesian Inference for Latent Feature Models
116 nips-2013-Fantope Projection and Selection: A near-optimal convex relaxation of sparse PCA
117 nips-2013-Fast Algorithms for Gaussian Noise Invariant Independent Component Analysis
118 nips-2013-Fast Determinantal Point Process Sampling with Application to Clustering
119 nips-2013-Fast Template Evaluation with Vector Quantization
120 nips-2013-Faster Ridge Regression via the Subsampled Randomized Hadamard Transform
121 nips-2013-Firing rate predictions in optimal balanced networks
122 nips-2013-First-order Decomposition Trees
123 nips-2013-Flexible sampling of discrete data correlations without the marginal distributions
125 nips-2013-From Bandits to Experts: A Tale of Domination and Independence
126 nips-2013-Gaussian Process Conditional Copulas with Applications to Financial Time Series
127 nips-2013-Generalized Denoising Auto-Encoders as Generative Models
128 nips-2013-Generalized Method-of-Moments for Rank Aggregation
129 nips-2013-Generalized Random Utility Models with Multiple Types
130 nips-2013-Generalizing Analytic Shrinkage for Arbitrary Covariance Structures
132 nips-2013-Global MAP-Optimality by Shrinking the Combinatorial Search Area with Convex Relaxation
134 nips-2013-Graphical Models for Inference with Missing Data
135 nips-2013-Heterogeneous-Neighborhood-based Multi-Task Local Learning Algorithms
137 nips-2013-High-Dimensional Gaussian Process Bandits
138 nips-2013-Higher Order Priors for Joint Intrinsic Image, Objects, and Attributes Estimation
139 nips-2013-How to Hedge an Option Against an Adversary: Black-Scholes Pricing is Minimax Optimal
140 nips-2013-Improved and Generalized Upper Bounds on the Complexity of Policy Iteration
143 nips-2013-Integrated Non-Factorized Variational Inference
144 nips-2013-Inverse Density as an Inverse Problem: the Fredholm Equation Approach
146 nips-2013-Large Scale Distributed Sparse Precision Estimation
147 nips-2013-Lasso Screening Rules via Dual Polytope Projection
148 nips-2013-Latent Maximum Margin Clustering
149 nips-2013-Latent Structured Active Learning
150 nips-2013-Learning Adaptive Value of Information for Structured Prediction
151 nips-2013-Learning Chordal Markov Networks by Constraint Satisfaction
153 nips-2013-Learning Feature Selection Dependencies in Multi-task Learning
154 nips-2013-Learning Gaussian Graphical Models with Observed or Latent FVSs
155 nips-2013-Learning Hidden Markov Models from Non-sequence Data via Tensor Decomposition
156 nips-2013-Learning Kernels Using Local Rademacher Complexity
157 nips-2013-Learning Multi-level Sparse Representations
158 nips-2013-Learning Multiple Models via Regularized Weighting
159 nips-2013-Learning Prices for Repeated Auctions with Strategic Buyers
160 nips-2013-Learning Stochastic Feedforward Neural Networks
161 nips-2013-Learning Stochastic Inverses
162 nips-2013-Learning Trajectory Preferences for Manipulators via Iterative Improvement
163 nips-2013-Learning a Deep Compact Image Representation for Visual Tracking
164 nips-2013-Learning and using language via recursive pragmatic reasoning about other agents
165 nips-2013-Learning from Limited Demonstrations
166 nips-2013-Learning invariant representations and applications to face verification
167 nips-2013-Learning the Local Statistics of Optical Flow
168 nips-2013-Learning to Pass Expectation Propagation Messages
169 nips-2013-Learning to Prune in Metric and Non-Metric Spaces
170 nips-2013-Learning with Invariance via Linear Functionals on Reproducing Kernel Hilbert Space
171 nips-2013-Learning with Noisy Labels
172 nips-2013-Learning word embeddings efficiently with noise-contrastive estimation
173 nips-2013-Least Informative Dimensions
174 nips-2013-Lexical and Hierarchical Topic Regression
175 nips-2013-Linear Convergence with Condition Number Independent Access of Full Gradients
176 nips-2013-Linear decision rule as aspiration for simple decision heuristics
177 nips-2013-Local Privacy and Minimax Bounds: Sharp Rates for Probability Estimation
178 nips-2013-Locally Adaptive Bayesian Multivariate Time Series
179 nips-2013-Low-Rank Matrix and Tensor Completion via Adaptive Sampling
180 nips-2013-Low-rank matrix reconstruction and clustering via approximate message passing
181 nips-2013-Machine Teaching for Bayesian Learners in the Exponential Family
182 nips-2013-Manifold-based Similarity Adaptation for Label Propagation
183 nips-2013-Mapping paradigm ontologies to and from the brain
184 nips-2013-Marginals-to-Models Reducibility
185 nips-2013-Matrix Completion From any Given Set of Observations
186 nips-2013-Matrix factorization with binary components
187 nips-2013-Memoized Online Variational Inference for Dirichlet Process Mixture Models
188 nips-2013-Memory Limited, Streaming PCA
189 nips-2013-Message Passing Inference with Chemical Reaction Networks
190 nips-2013-Mid-level Visual Element Discovery as Discriminative Mode Seeking
191 nips-2013-Minimax Optimal Algorithms for Unconstrained Linear Optimization
192 nips-2013-Minimax Theory for High-dimensional Gaussian Mixtures with Sparse Mean Separation
193 nips-2013-Mixed Optimization for Smooth Functions
195 nips-2013-Modeling Clutter Perception using Parametric Proto-object Partitioning
196 nips-2013-Modeling Overlapping Communities with Node Popularities
197 nips-2013-Moment-based Uniform Deviation Bounds for $k$-means and Friends
198 nips-2013-More Effective Distributed ML via a Stale Synchronous Parallel Parameter Server
199 nips-2013-More data speeds up training time in learning halfspaces over sparse vectors
200 nips-2013-Multi-Prediction Deep Boltzmann Machines
201 nips-2013-Multi-Task Bayesian Optimization
202 nips-2013-Multiclass Total Variation Clustering
203 nips-2013-Multilinear Dynamical Systems for Tensor Time Series
204 nips-2013-Multiscale Dictionary Learning for Estimating Conditional Distributions
205 nips-2013-Multisensory Encoding, Decoding, and Identification
206 nips-2013-Near-Optimal Entrywise Sampling for Data Matrices
207 nips-2013-Near-optimal Anomaly Detection in Graphs using Lovasz Extended Scan Statistic
209 nips-2013-New Subsampling Algorithms for Fast Least Squares Regression
210 nips-2013-Noise-Enhanced Associative Memories
211 nips-2013-Non-Linear Domain Adaptation with Boosting
212 nips-2013-Non-Uniform Camera Shake Removal Using a Spatially-Adaptive Sparse Penalty
213 nips-2013-Nonparametric Multi-group Membership Model for Dynamic Networks
214 nips-2013-On Algorithms for Sparse Multi-factor NMF
215 nips-2013-On Decomposing the Proximal Map
216 nips-2013-On Flat versus Hierarchical Classification in Large-Scale Taxonomies
217 nips-2013-On Poisson Graphical Models
218 nips-2013-On Sampling from the Gibbs Distribution with Random Maximum A-Posteriori Perturbations
219 nips-2013-On model selection consistency of penalized M-estimators: a geometric theory
220 nips-2013-On the Complexity and Approximation of Binary Evidence in Lifted Inference
221 nips-2013-On the Expressive Power of Restricted Boltzmann Machines
222 nips-2013-On the Linear Convergence of the Proximal Gradient Method for Trace Norm Regularization
224 nips-2013-On the Sample Complexity of Subspace Learning
225 nips-2013-One-shot learning and big data with n=2
226 nips-2013-One-shot learning by inverting a compositional causal process
228 nips-2013-Online Learning of Dynamic Parameters in Social Networks
229 nips-2013-Online Learning of Nonparametric Mixture Models via Sequential Variational Approximation
230 nips-2013-Online Learning with Costly Features and Labels
231 nips-2013-Online Learning with Switching Costs and Other Adaptive Adversaries
232 nips-2013-Online PCA for Contaminated Data
233 nips-2013-Online Robust PCA via Stochastic Optimization
235 nips-2013-Online learning in episodic Markovian decision processes by relative entropy policy search
236 nips-2013-Optimal Neural Population Codes for High-dimensional Stimulus Variables
237 nips-2013-Optimal integration of visual speed across different spatiotemporal frequency channels
238 nips-2013-Optimistic Concurrency Control for Distributed Unsupervised Learning
240 nips-2013-Optimization, Learning, and Games with Predictable Sequences
241 nips-2013-Optimizing Instructional Policies
242 nips-2013-PAC-Bayes-Empirical-Bernstein Inequality
243 nips-2013-Parallel Sampling of DP Mixture Models using Sub-Cluster Splits
244 nips-2013-Parametric Task Learning
245 nips-2013-Pass-efficient unsupervised feature selection
246 nips-2013-Perfect Associative Learning with Spike-Timing-Dependent Plasticity
247 nips-2013-Phase Retrieval using Alternating Minimization
248 nips-2013-Point Based Value Iteration with Optimal Belief Compression for Dec-POMDPs
249 nips-2013-Polar Operators for Structured Sparse Estimation
250 nips-2013-Policy Shaping: Integrating Human Feedback with Reinforcement Learning
251 nips-2013-Predicting Parameters in Deep Learning
252 nips-2013-Predictive PAC Learning and Process Decompositions
253 nips-2013-Prior-free and prior-dependent regret bounds for Thompson Sampling
254 nips-2013-Probabilistic Low-Rank Matrix Completion with Adaptive Spectral Regularization Algorithms
255 nips-2013-Probabilistic Movement Primitives
256 nips-2013-Probabilistic Principal Geodesic Analysis
257 nips-2013-Projected Natural Actor-Critic
258 nips-2013-Projecting Ising Model Parameters for Fast Mixing
259 nips-2013-Provable Subspace Clustering: When LRR meets SSC
260 nips-2013-RNADE: The real-valued neural autoregressive density-estimator
261 nips-2013-Rapid Distance-Based Outlier Detection via Sampling
262 nips-2013-Real-Time Inference for a Gamma Process Model of Neural Spiking
263 nips-2013-Reasoning With Neural Tensor Networks for Knowledge Base Completion
265 nips-2013-Reconciling "priors" & "priors" without prejudice?
266 nips-2013-Recurrent linear models of simultaneously-recorded neural populations
268 nips-2013-Reflection methods for user-friendly submodular optimization
269 nips-2013-Regression-tree Tuning in a Streaming Setting
270 nips-2013-Regret based Robust Solutions for Uncertain Markov Decision Processes
272 nips-2013-Regularized Spectral Clustering under the Degree-Corrected Stochastic Blockmodel
273 nips-2013-Reinforcement Learning in Robust Markov Decision Processes
274 nips-2013-Relevance Topic Model for Unstructured Social Group Activity Recognition
275 nips-2013-Reservoir Boosting : Between Online and Offline Ensemble Learning
276 nips-2013-Reshaping Visual Datasets for Domain Adaptation
277 nips-2013-Restricting exchangeable nonparametric distributions
278 nips-2013-Reward Mapping for Transfer in Long-Lived Agents
279 nips-2013-Robust Bloom Filters for Large MultiLabel Classification Tasks
280 nips-2013-Robust Data-Driven Dynamic Programming
281 nips-2013-Robust Low Rank Kernel Embeddings of Multivariate Distributions
282 nips-2013-Robust Multimodal Graph Matching: Sparse Coding Meets Graph Matching
283 nips-2013-Robust Sparse Principal Component Regression under the High Dimensional Elliptical Model
284 nips-2013-Robust Spatial Filtering with Beta Divergence
285 nips-2013-Robust Transfer Principal Component Analysis with Rank Constraints
286 nips-2013-Robust learning of low-dimensional dynamics from large neural ensembles
287 nips-2013-Scalable Inference for Logistic-Normal Topic Models
288 nips-2013-Scalable Influence Estimation in Continuous-Time Diffusion Networks
289 nips-2013-Scalable kernels for graphs with continuous attributes
290 nips-2013-Scoring Workers in Crowdsourcing: How Many Control Questions are Enough?
291 nips-2013-Sensor Selection in High-Dimensional Gaussian Trees with Nuisances
292 nips-2013-Sequential Transfer in Multi-armed Bandit with Finite Set of Models
293 nips-2013-Sign Cauchy Projections and Chi-Square Kernel
294 nips-2013-Similarity Component Analysis
295 nips-2013-Simultaneous Rectification and Alignment via Robust Recovery of Low-rank Tensors
296 nips-2013-Sinkhorn Distances: Lightspeed Computation of Optimal Transport
297 nips-2013-Sketching Structured Matrices for Faster Nonlinear Regression
298 nips-2013-Small-Variance Asymptotics for Hidden Markov Models
299 nips-2013-Solving inverse problem of Markov chain with partial observations
300 nips-2013-Solving the multi-way matching problem by permutation synchronization
301 nips-2013-Sparse Additive Text Models with Low Rank Background
302 nips-2013-Sparse Inverse Covariance Estimation with Calibration
303 nips-2013-Sparse Overlapping Sets Lasso for Multitask Learning and its Application to fMRI Analysis
305 nips-2013-Spectral methods for neural characterization using generalized quadratic models
306 nips-2013-Speeding up Permutation Testing in Neuroimaging
307 nips-2013-Speedup Matrix Completion with Side Information: Application to Multi-Label Learning
308 nips-2013-Spike train entropy-rate estimation using hierarchical Dirichlet process priors
309 nips-2013-Statistical Active Learning Algorithms
310 nips-2013-Statistical analysis of coupled time series with Kernel Cross-Spectral Density operators.
311 nips-2013-Stochastic Convex Optimization with Multiple Objectives
312 nips-2013-Stochastic Gradient Riemannian Langevin Dynamics on the Probability Simplex
313 nips-2013-Stochastic Majorization-Minimization Algorithms for Large-Scale Optimization
314 nips-2013-Stochastic Optimization of PCA with Capped MSG
315 nips-2013-Stochastic Ratio Matching of RBMs for Sparse High-Dimensional Inputs
316 nips-2013-Stochastic blockmodel approximation of a graphon: Theory and consistent estimation
317 nips-2013-Streaming Variational Bayes
318 nips-2013-Structured Learning via Logistic Regression
319 nips-2013-Submodular Optimization with Submodular Cover and Submodular Knapsack Constraints
320 nips-2013-Summary Statistics for Partitionings and Feature Allocations
321 nips-2013-Supervised Sparse Analysis and Synthesis Operators
322 nips-2013-Symbolic Opportunistic Policy Iteration for Factored-Action MDPs
323 nips-2013-Synthesizing Robust Plans under Incomplete Domain Models
324 nips-2013-The Fast Convergence of Incremental PCA
325 nips-2013-The Pareto Regret Frontier
326 nips-2013-The Power of Asymmetry in Binary Hashing
327 nips-2013-The Randomized Dependence Coefficient
328 nips-2013-The Total Variation on Hypergraphs - Learning on Hypergraphs Revisited
329 nips-2013-Third-Order Edge Statistics: Contour Continuation, Curvature, and Cortical Connections
330 nips-2013-Thompson Sampling for 1-Dimensional Exponential Family Bandits
331 nips-2013-Top-Down Regularization of Deep Belief Networks
332 nips-2013-Tracking Time-varying Graphical Structure
333 nips-2013-Trading Computation for Communication: Distributed Stochastic Dual Coordinate Ascent
334 nips-2013-Training and Analysing Deep Recurrent Neural Networks
335 nips-2013-Transfer Learning in a Transductive Setting
336 nips-2013-Translating Embeddings for Modeling Multi-relational Data
337 nips-2013-Transportability from Multiple Environments with Limited Experiments
338 nips-2013-Two-Target Algorithms for Infinite-Armed Bandits with Bernoulli Rewards
339 nips-2013-Understanding Dropout
340 nips-2013-Understanding variable importances in forests of randomized trees
341 nips-2013-Universal models for binary spike patterns using centered Dirichlet processes
342 nips-2013-Unsupervised Spectral Learning of Finite State Transducers
343 nips-2013-Unsupervised Structure Learning of Stochastic And-Or Grammars
344 nips-2013-Using multiple samples to learn mixture models
345 nips-2013-Variance Reduction for Stochastic Gradient Optimization
346 nips-2013-Variational Inference for Mahalanobis Distance Metrics in Gaussian Process Regression
347 nips-2013-Variational Planning for Graph-based MDPs
348 nips-2013-Variational Policy Search via Trajectory Optimization
350 nips-2013-Wavelets on Graphs via Deep Learning
352 nips-2013-What do row and column marginals reveal about your dataset?
354 nips-2013-When in Doubt, SWAP: High-Dimensional Sparse Recovery from Correlated Measurements
355 nips-2013-Which Space Partitioning Tree to Use for Search?
356 nips-2013-Zero-Shot Learning Through Cross-Modal Transfer
357 nips-2013-k-Prototype Learning for 3D Rigid Structures
358 nips-2013-q-OCSVM: A q-Quantile Estimator for High-Dimensional Distributions
359 nips-2013-Σ-Optimality for Active Learning on Gaussian Random Fields