jmlr jmlr2013 jmlr2013-45 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Jarno Vanhatalo, Jaakko Riihimäki, Jouni Hartikainen, Pasi Jylänki, Ville Tolvanen, Aki Vehtari
Abstract: The GPstuff toolbox is a versatile collection of Gaussian process models and computational tools required for Bayesian inference. The tools include, among others, various inference methods, sparse approximations and model assessment methods. Keywords: Gaussian process, Bayesian hierarchical model, nonparametric Bayes
Reference: text
sentIndex sentText sentNum sentScore
1 Box 65 FI-00014 Helsinki, Finland Jaakko Riihimäki Jouni Hartikainen Pasi Jylänki Ville Tolvanen Aki Vehtari JAAKKO . [sent-5, score-0.205]
2 Box 12200 FI-00076 Aalto, Finland Editor: Balazs Kegl Abstract The GPstuff toolbox is a versatile collection of Gaussian process models and computational tools required for Bayesian inference. [sent-17, score-0.176]
3 The tools include, among others, various inference methods, sparse approximations and model assessment methods. [sent-18, score-0.169]
4 Introduction: A Gaussian process (GP) prior provides a flexible building block for many hierarchical Bayesian models (Rasmussen and Williams, 2006). [sent-20, score-0.037]
5 1) is a versatile collection of computational tools for GP models and it has already been used in several published projects, for example, in epidemiology, species distribution modeling and building energy usage modeling (see Vanhatalo et al. [sent-22, score-0.125]
6 GPstuff combines models and inference tools in a modular format. [sent-24, score-0.112]
7 It also provides various sparse GP models and methods for model assessment. [sent-25, score-0.081]
8 The toolbox is compatible with Unix and Windows Matlab (r2009b or later). [sent-26, score-0.051]
9 The observations y = [y1 , . . . , yn ]T related to the inputs (covariates) X = {xi = [xi,1 , . . . , xi,d ]T }, i = 1, . . . , n, are assumed to be conditionally independent given a latent function (or predictor) f (x). [sent-38, score-0.041]
10 The likelihood then factorizes as p(y|f, γ) = ∏i p(yi | fi , γ), where f = [ f (x1 ), . . . , f (xn )]T . [sent-41, score-0.145]
11 The latent function is given a GP prior, f ∼ GP(m(x|φ), k(x, x′ |θ)), which is defined by the mean and covariance functions, m(x|φ) and k(x, x′ |θ), respectively. [sent-48, score-0.114]
12 The parameters, ϑ = {γ, φ, θ}, are given a hyperprior after which the posterior p(f|y, X) is approximated and used for prediction. [sent-49, score-0.051]
13 Most of the models in GPstuff follow the above single latent dependency, but there are also models where each factor depends on multiple latent values. [sent-50, score-0.17]
14 We illustrate the construction and inference of a GP model with a regression example. [sent-51, score-0.049]
15 First, we assume yi = f (xi ) + εi , εi ∼ N(0, σ2 ), and give f (x) a GP prior with a squared exponential covariance function, k(x, x′ ) = σ2se exp(−||x − x′ ||2 /(2l 2 )). [sent-52, score-0.066]
16 gp = gp_set('lik', lik, 'cf', gpcf); % init. [sent-58, score-0.389]
17 The structures lik and gpcf contain all the essential information about the likelihood and covariance function such as parameter values and function handles to construct a covariance matrix and its gradient with respect to the parameters. [sent-60, score-0.483]
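As a minimal sketch (not code taken verbatim from the paper), the lik and gpcf structures for this regression example could be built with lik_gaussian and gpcf_sexp before being collected by gp_set; the parameter values below are illustrative assumptions.

lik  = lik_gaussian('sigma2', 0.2^2);                 % Gaussian observation noise model (value assumed)
gpcf = gpcf_sexp('lengthScale', 1, 'magnSigma2', 1);  % squared exponential covariance (values assumed)
gp   = gp_set('lik', lik, 'cf', gpcf);                % collect the model blocks into a GP structure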
18 All the model blocks are collected into a GP structure constructed by gp_set. [sent-61, score-0.433]
19 The first option assumes a Gaussian observation model, which enables an analytic solution for the marginal likelihood p(y|X, ϑ) and the conditional posterior p(f|X, y, ϑ). [sent-63, score-0.107]
20 Using the relation p(ϑ|y, X) ∝ p(y|X, ϑ)p(ϑ), the parameters ϑ can be optimized to the maximum a posteriori (MAP) estimate or marginalized over with grid, central composite design (CCD), importance sampling (IS) or Markov chain Monte Carlo (MCMC) integration (Vanhatalo et al. [sent-64, score-0.083]
21 With other observation models the marginal likelihood and the conditional posterior have to be approximated either with Laplace’s method (LA) or expectation propagation (EP) (Rasmussen and Williams, 2006). [sent-66, score-0.144]
22 An alternative approach is to sample from the joint posterior p(f, ϑ|X, y) with MCMC by alternating sampling from p(f|X, y, ϑ) and p(ϑ|X, y, f). [sent-67, score-0.051]
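A hedged sketch of how the MAP optimization and prediction steps referred to below might be invoked; the optimizer options and output argument names are assumptions rather than values from the paper.

opt = optimset('TolFun', 1e-3, 'TolX', 1e-3);   % stopping tolerances for the optimizer (assumed values)
gp  = gp_optim(gp, x, y, 'opt', opt);           % optimize the parameters to their MAP estimate
[Ef, Varf] = gp_pred(gp, x, y, xt);             % predictive mean and variance at the test inputs xt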
23 Above, gp_optim returns a redefined model structure with parameter values optimized to their MAP estimate. [sent-68, score-0.389]
24 gp_pred returns the conditional posterior predictive mean E[ f |y, X, ϑ] and variance Var[ f |y, X, ϑ] at the test inputs. [sent-70, score-0.44]
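For the fully Bayesian alternative mentioned above, sampling from the joint posterior could look roughly as follows; this is a hedged sketch and the option names and sample count are assumptions.

gp = gp_set(gp, 'latent_method', 'MCMC');               % sample the latent values instead of approximating them
[rgp, gp_last, opt] = gp_mc(gp, x, y, 'nsamples', 200);  % alternate sampling of f and the parameters ϑ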
25 Many sparse GPs have been proposed to speed up the computations with large data sets. [sent-71, score-0.044]
26 GPstuff includes the FI(T)C, PIC, SOR, DTC (Quiñonero-Candela and Rasmussen, 2005), VAR (Titsias, 2009), and CS+FIC (Vanhatalo and Vehtari, 2008) sparse approximations, and several compactly supported (CS) covariance functions. [sent-72, score-0.11]
27 gpcf2 = gpcf_ppcs2('nin', nin, 'lengthScale', 5, 'magnSigma2', 1); gp = gp_set('type', 'CS+FIC', 'lik', lik, 'cf', {gpcf, gpcf2}, 'X_u', Xu); In the first line, a CS covariance function, a piecewise polynomial of second order, is created. [sent-74, score-0.455]
28 It is then given to the GP structure together with inducing inputs (Xu) and sparse GP type definition. [sent-75, score-0.111]
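One hedged way to obtain the inducing inputs Xu used above is to pick a subset of the training inputs (or, e.g., k-means centers); the subset size below is an assumption.

nu  = 100;                         % number of inducing inputs (assumed)
idx = randperm(size(x, 1));        % random ordering of the training inputs
Xu  = x(idx(1:nu), :);             % use a random subset as inducing input locations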
29 We can tailor the above model, for example, by replacing the Gaussian observation model with a more robust Student-t observation model (Jylänki et al. [sent-76, score-0.137]
30 GPstuff has a wide variety of observation models (see Table 1), of which we want to highlight the implementations of the recently proposed multinomial probit with EP (Riihimäki et al. [sent-79, score-0.224]
31 , 2013) and logistic GP density estimation and regression with the Laplace approximation (Riihimäki and Vehtari, 2012). [sent-80, score-0.068]
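A hedged sketch of such a replacement: with a non-Gaussian likelihood an approximate latent method (LA, EP or MCMC) is selected when the GP structure is built; the degrees of freedom, scale and choice of EP below are assumptions.

lik = lik_t('nu', 4, 'sigma2', 0.1);                          % Student-t observation model (values assumed)
gp  = gp_set('lik', lik, 'cf', gpcf, 'latent_method', 'EP');  % EP approximation for the latent posterior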
32 The constructed models could be compared, for example, with the deviance information criterion (DIC), the widely applicable information criterion (WAIC), leave-one-out or k-fold cross-validation (LOO/kf-CV) (Vehtari and Ojanen, 2012) with the functions gp_dic, gp_waic, gp_loopred and gp_kfcv. [sent-81, score-1.593]
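Hedged examples of these assessment calls; the exact output arguments are assumptions and may differ in detail from the toolbox documentation.

waic = gp_waic(gp, x, y);                    % widely applicable information criterion
dic  = gp_dic(gp, x, y);                     % deviance information criterion
[Efl, Varfl, lpyt] = gp_loopred(gp, x, y);   % leave-one-out predictive mean, variance and log density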
33 New models can be implemented by modifying the existing model blocks, such as covariance functions. [sent-82, score-0.103]
34 Adding new inference methods is more laborious since they require summaries from model blocks which may not be provided by the current version of GPstuff. [sent-83, score-0.093]
35 Related Software: Perhaps the best known GP software packages are the Gaussian Processes for Machine Learning (GPML) toolbox (Rasmussen and Nickisch, 2010) and the Flexible Bayesian Modelling (FBM) software (Neal, 1998). [sent-87, score-0.065]
36 Overviews of alternatives are provided by the Gaussian processes website (http://www. [sent-88, score-0.034]
37 The main advantage of GPstuff over the other GP software is its versatile collection of models and computational tools. [sent-93, score-0.13]
38 In addition, the implementation of the sparse matrix routines, used with the CS covariance functions, relies on the SuiteSparse toolbox (Davis, 2005). [sent-100, score-0.161]
39 Some pieces of code have been written by people other than us. [sent-102, score-0.024]
40 We thank them all for sharing their code under a free software license. [sent-130, score-0.031]
41 In the case of model blocks, the notation x means that the block can be inferred with any inference method (EP, LA (Laplace), MCMC and, in the case of GPML, also VB). [sent-136, score-0.093]
42 In the case of sparse approximations, inference methods and model assessment methods, x means that the method is available for all model blocks. [sent-137, score-0.143]
43 A unifying view of sparse approximate Gaussian process regression. [sent-167, score-0.044]
44 Nested expectation propagation for Gaussian process classification with a multinomial probit likelihood. [sent-185, score-0.119]
45 Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations. [sent-188, score-0.16]
46 Variational learning of inducing variables in sparse Gaussian processes. [sent-192, score-0.07]
47 Modelling local and global phenomena with sparse Gaussian processes. [sent-195, score-0.044]
48 Approximate inference for disease mapping with sparse Gaussian processes. [sent-199, score-0.093]
49 Bayesian modeling with Gaussian processes using the GPstuff toolbox. [sent-202, score-0.034]
wordName wordTfidf (topN-words)
[('gp', 0.389), ('gpstuff', 0.32), ('vanhatalo', 0.297), ('aki', 0.247), ('lik', 0.214), ('aalto', 0.206), ('jarno', 0.187), ('riihim', 0.16), ('jyl', 0.137), ('nki', 0.137), ('pasi', 0.137), ('vehtari', 0.137), ('ep', 0.136), ('jaakko', 0.123), ('mcmc', 0.12), ('ville', 0.114), ('gpml', 0.114), ('fbm', 0.107), ('gpcf', 0.107), ('hartikainen', 0.107), ('jouni', 0.107), ('tolvanen', 0.107), ('fic', 0.103), ('hmc', 0.091), ('sls', 0.091), ('rasmussen', 0.087), ('dic', 0.08), ('multinomial', 0.069), ('waic', 0.069), ('ki', 0.068), ('fi', 0.067), ('covariance', 0.066), ('versatile', 0.062), ('cs', 0.061), ('laplace', 0.058), ('bayesian', 0.058), ('cf', 0.057), ('gaussian', 0.055), ('artikainen', 0.053), ('becs', 0.053), ('ccd', 0.053), ('dtc', 0.053), ('masking', 0.053), ('nabney', 0.053), ('netlab', 0.053), ('olvanen', 0.053), ('pietil', 0.053), ('stuff', 0.053), ('weibull', 0.053), ('toolbox', 0.051), ('binomial', 0.051), ('posterior', 0.051), ('probit', 0.05), ('edward', 0.05), ('assessment', 0.05), ('inference', 0.049), ('latent', 0.048), ('helsinki', 0.048), ('sor', 0.046), ('anki', 0.046), ('ehtari', 0.046), ('iihim', 0.046), ('inen', 0.046), ('odeling', 0.046), ('pic', 0.046), ('carl', 0.045), ('finland', 0.045), ('sparse', 0.044), ('blocks', 0.044), ('inputs', 0.041), ('qui', 0.041), ('cox', 0.041), ('nin', 0.041), ('rocesses', 0.041), ('models', 0.037), ('lengthscale', 0.035), ('metropolis', 0.035), ('slice', 0.035), ('aussian', 0.035), ('la', 0.035), ('processes', 0.034), ('opt', 0.033), ('marginalized', 0.032), ('ml', 0.032), ('davis', 0.032), ('software', 0.031), ('yl', 0.03), ('vb', 0.03), ('likelihood', 0.03), ('cv', 0.029), ('rue', 0.028), ('neal', 0.028), ('williams', 0.027), ('priors', 0.027), ('tools', 0.026), ('var', 0.026), ('marginal', 0.026), ('inducing', 0.026), ('nested', 0.026), ('people', 0.024)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000005 45 jmlr-2013-GPstuff: Bayesian Modeling with Gaussian Processes
Author: Jarno Vanhatalo, Jaakko Riihimäki, Jouni Hartikainen, Pasi Jylänki, Ville Tolvanen, Aki Vehtari
Abstract: The GPstuff toolbox is a versatile collection of Gaussian process models and computational tools required for Bayesian inference. The tools include, among others, various inference methods, sparse approximations and model assessment methods. Keywords: Gaussian process, Bayesian hierarchical model, nonparametric Bayes
2 0.23465608 75 jmlr-2013-Nested Expectation Propagation for Gaussian Process Classification with a Multinomial Probit Likelihood
Author: Jaakko Riihimäki, Pasi Jylänki, Aki Vehtari
Abstract: This paper considers probabilistic multinomial probit classification using Gaussian process (GP) priors. Challenges with multiclass GP classification are the integration over the non-Gaussian posterior distribution, and the increase of the number of unknown latent variables as the number of target classes grows. Expectation propagation (EP) has proven to be a very accurate method for approximate inference but the existing EP approaches for the multinomial probit GP classification rely on numerical quadratures, or independence assumptions between the latent values associated with different classes, to facilitate the computations. In this paper we propose a novel nested EP approach which does not require numerical quadratures, and approximates accurately all betweenclass posterior dependencies of the latent values, but still scales linearly in the number of classes. The predictive accuracy of the nested EP approach is compared to Laplace, variational Bayes, and Markov chain Monte Carlo (MCMC) approximations with various benchmark data sets. In the experiments nested EP was the most consistent method compared to MCMC sampling, but in terms of classification accuracy the differences between all the methods were small from a practical point of view. Keywords: Gaussian process, multiclass classification, multinomial probit, approximate inference, expectation propagation
3 0.1264182 47 jmlr-2013-Gaussian Kullback-Leibler Approximate Inference
Author: Edward Challis, David Barber
Abstract: We investigate Gaussian Kullback-Leibler (G-KL) variational approximate inference techniques for Bayesian generalised linear models and various extensions. In particular we make the following novel contributions: sufficient conditions for which the G-KL objective is differentiable and convex are described; constrained parameterisations of Gaussian covariance that make G-KL methods fast and scalable are provided; the lower bound to the normalisation constant provided by G-KL methods is proven to dominate those provided by local lower bounding methods; complexity and model applicability issues of G-KL versus other Gaussian approximate inference methods are discussed. Numerical results comparing G-KL and other deterministic Gaussian approximate inference methods are presented for: robust Gaussian process regression models with either Student-t or Laplace likelihoods, large scale Bayesian binary logistic regression models, and Bayesian sparse linear models for sequential experimental design. Keywords: generalised linear models, latent linear models, variational approximate inference, large scale inference, sparse learning, experimental design, active learning, Gaussian processes
4 0.10839439 88 jmlr-2013-Perturbative Corrections for Approximate Inference in Gaussian Latent Variable Models
Author: Manfred Opper, Ulrich Paquet, Ole Winther
Abstract: Expectation Propagation (EP) provides a framework for approximate inference. When the model under consideration is over a latent Gaussian field, with the approximation being Gaussian, we show how these approximations can systematically be corrected. A perturbative expansion is made of the exact but intractable correction, and can be applied to the model’s partition function and other moments of interest. The correction is expressed over the higher-order cumulants which are neglected by EP’s local matching of moments. Through the expansion, we see that EP is correct to first order. By considering higher orders, corrections of increasing polynomial complexity can be applied to the approximation. The second order provides a correction in quadratic time, which we apply to an array of Gaussian process and Ising models. The corrections generalize to arbitrarily complex approximating families, which we illustrate on tree-structured Ising model approximations. Furthermore, they provide a polynomial-time assessment of the approximation error. We also provide both theoretical and practical insights on the exactness of the EP solution. Keywords: expectation consistent inference, expectation propagation, perturbation correction, Wick expansions, Ising model, Gaussian process
5 0.093943052 93 jmlr-2013-Random Walk Kernels and Learning Curves for Gaussian Process Regression on Random Graphs
Author: Matthew J. Urry, Peter Sollich
Abstract: We consider learning on graphs, guided by kernels that encode similarity between vertices. Our focus is on random walk kernels, the analogues of squared exponential kernels in Euclidean spaces. We show that on large, locally treelike graphs these have some counter-intuitive properties, specifically in the limit of large kernel lengthscales. We consider using these kernels as covariance functions of Gaussian processes. In this situation one typically scales the prior globally to normalise the average of the prior variance across vertices. We demonstrate that, in contrast to the Euclidean case, this generically leads to significant variation in the prior variance across vertices, which is undesirable from a probabilistic modelling point of view. We suggest the random walk kernel should be normalised locally, so that each vertex has the same prior variance, and analyse the consequences of this by studying learning curves for Gaussian process regression. Numerical calculations as well as novel theoretical predictions for the learning curves using belief propagation show that one obtains distinctly different probabilistic models depending on the choice of normalisation. Our method for predicting the learning curves using belief propagation is significantly more accurate than previous approximations and should become exact in the limit of large random graphs. Keywords: Gaussian process, generalisation error, learning curve, cavity method, belief propagation, graph, random walk kernel
6 0.088896491 48 jmlr-2013-Generalized Spike-and-Slab Priors for Bayesian Group Feature Selection Using Expectation Propagation
7 0.077957898 3 jmlr-2013-A Framework for Evaluating Approximation Methods for Gaussian Process Regression
8 0.068840109 121 jmlr-2013-Variational Inference in Nonconjugate Models
9 0.046679679 108 jmlr-2013-Stochastic Variational Inference
10 0.042248331 90 jmlr-2013-Quasi-Newton Method: A New Direction
11 0.041095052 15 jmlr-2013-Bayesian Canonical Correlation Analysis
12 0.038056955 43 jmlr-2013-Fast MCMC Sampling for Markov Jump Processes and Extensions
13 0.032275427 16 jmlr-2013-Bayesian Nonparametric Hidden Semi-Markov Models
14 0.025761919 49 jmlr-2013-Global Analytic Solution of Fully-observed Variational Bayesian Matrix Factorization
15 0.023474185 104 jmlr-2013-Sparse Single-Index Model
16 0.022933839 115 jmlr-2013-Training Energy-Based Models for Time-Series Imputation
17 0.022238145 102 jmlr-2013-Sparse Matrix Inversion with Scaled Lasso
18 0.022114256 9 jmlr-2013-A Widely Applicable Bayesian Information Criterion
19 0.018867323 120 jmlr-2013-Variational Algorithms for Marginal MAP
20 0.017900037 5 jmlr-2013-A Near-Optimal Algorithm for Differentially-Private Principal Components
topicId topicWeight
[(0, -0.147), (1, -0.309), (2, 0.083), (3, -0.073), (4, 0.111), (5, -0.13), (6, -0.249), (7, -0.11), (8, -0.19), (9, 0.013), (10, 0.025), (11, 0.011), (12, -0.001), (13, -0.037), (14, 0.023), (15, 0.031), (16, 0.064), (17, -0.089), (18, -0.022), (19, 0.008), (20, -0.029), (21, -0.074), (22, 0.02), (23, -0.131), (24, -0.017), (25, -0.008), (26, -0.056), (27, 0.01), (28, -0.016), (29, -0.01), (30, -0.021), (31, 0.086), (32, 0.093), (33, 0.228), (34, -0.034), (35, 0.078), (36, -0.059), (37, -0.048), (38, -0.098), (39, -0.014), (40, -0.079), (41, 0.057), (42, -0.071), (43, -0.009), (44, 0.049), (45, 0.049), (46, -0.063), (47, -0.025), (48, -0.058), (49, -0.013)]
simIndex simValue paperId paperTitle
same-paper 1 0.95956004 45 jmlr-2013-GPstuff: Bayesian Modeling with Gaussian Processes
Author: Jarno Vanhatalo, Jaakko Riihimäki, Jouni Hartikainen, Pasi Jylänki, Ville Tolvanen, Aki Vehtari
Abstract: The GPstuff toolbox is a versatile collection of Gaussian process models and computational tools required for Bayesian inference. The tools include, among others, various inference methods, sparse approximations and model assessment methods. Keywords: Gaussian process, Bayesian hierarchical model, nonparametric Bayes
2 0.71994984 75 jmlr-2013-Nested Expectation Propagation for Gaussian Process Classification with a Multinomial Probit Likelihood
Author: Jaakko Riihimäki, Pasi Jylänki, Aki Vehtari
Abstract: This paper considers probabilistic multinomial probit classification using Gaussian process (GP) priors. Challenges with multiclass GP classification are the integration over the non-Gaussian posterior distribution, and the increase of the number of unknown latent variables as the number of target classes grows. Expectation propagation (EP) has proven to be a very accurate method for approximate inference but the existing EP approaches for the multinomial probit GP classification rely on numerical quadratures, or independence assumptions between the latent values associated with different classes, to facilitate the computations. In this paper we propose a novel nested EP approach which does not require numerical quadratures, and approximates accurately all betweenclass posterior dependencies of the latent values, but still scales linearly in the number of classes. The predictive accuracy of the nested EP approach is compared to Laplace, variational Bayes, and Markov chain Monte Carlo (MCMC) approximations with various benchmark data sets. In the experiments nested EP was the most consistent method compared to MCMC sampling, but in terms of classification accuracy the differences between all the methods were small from a practical point of view. Keywords: Gaussian process, multiclass classification, multinomial probit, approximate inference, expectation propagation
3 0.6734575 3 jmlr-2013-A Framework for Evaluating Approximation Methods for Gaussian Process Regression
Author: Krzysztof Chalupka, Christopher K. I. Williams, Iain Murray
Abstract: Gaussian process (GP) predictors are an important component of many Bayesian approaches to machine learning. However, even a straightforward implementation of Gaussian process regression (GPR) requires O(n2 ) space and O(n3 ) time for a data set of n examples. Several approximation methods have been proposed, but there is a lack of understanding of the relative merits of the different approximations, and in what situations they are most useful. We recommend assessing the quality of the predictions obtained as a function of the compute time taken, and comparing to standard baselines (e.g., Subset of Data and FITC). We empirically investigate four different approximation algorithms on four different prediction problems, and make our code available to encourage future comparisons. Keywords: Gaussian process regression, subset of data, FITC, local GP
4 0.53184146 47 jmlr-2013-Gaussian Kullback-Leibler Approximate Inference
Author: Edward Challis, David Barber
Abstract: We investigate Gaussian Kullback-Leibler (G-KL) variational approximate inference techniques for Bayesian generalised linear models and various extensions. In particular we make the following novel contributions: sufficient conditions for which the G-KL objective is differentiable and convex are described; constrained parameterisations of Gaussian covariance that make G-KL methods fast and scalable are provided; the lower bound to the normalisation constant provided by G-KL methods is proven to dominate those provided by local lower bounding methods; complexity and model applicability issues of G-KL versus other Gaussian approximate inference methods are discussed. Numerical results comparing G-KL and other deterministic Gaussian approximate inference methods are presented for: robust Gaussian process regression models with either Student-t or Laplace likelihoods, large scale Bayesian binary logistic regression models, and Bayesian sparse linear models for sequential experimental design. Keywords: generalised linear models, latent linear models, variational approximate inference, large scale inference, sparse learning, experimental design, active learning, Gaussian processes
5 0.49133506 88 jmlr-2013-Perturbative Corrections for Approximate Inference in Gaussian Latent Variable Models
Author: Manfred Opper, Ulrich Paquet, Ole Winther
Abstract: Expectation Propagation (EP) provides a framework for approximate inference. When the model under consideration is over a latent Gaussian field, with the approximation being Gaussian, we show how these approximations can systematically be corrected. A perturbative expansion is made of the exact but intractable correction, and can be applied to the model’s partition function and other moments of interest. The correction is expressed over the higher-order cumulants which are neglected by EP’s local matching of moments. Through the expansion, we see that EP is correct to first order. By considering higher orders, corrections of increasing polynomial complexity can be applied to the approximation. The second order provides a correction in quadratic time, which we apply to an array of Gaussian process and Ising models. The corrections generalize to arbitrarily complex approximating families, which we illustrate on tree-structured Ising model approximations. Furthermore, they provide a polynomial-time assessment of the approximation error. We also provide both theoretical and practical insights on the exactness of the EP solution. Keywords: expectation consistent inference, expectation propagation, perturbation correction, Wick expansions, Ising model, Gaussian process
6 0.43840829 93 jmlr-2013-Random Walk Kernels and Learning Curves for Gaussian Process Regression on Random Graphs
7 0.42046994 48 jmlr-2013-Generalized Spike-and-Slab Priors for Bayesian Group Feature Selection Using Expectation Propagation
8 0.27154496 15 jmlr-2013-Bayesian Canonical Correlation Analysis
9 0.23668005 90 jmlr-2013-Quasi-Newton Method: A New Direction
10 0.20406163 43 jmlr-2013-Fast MCMC Sampling for Markov Jump Processes and Extensions
11 0.18050367 121 jmlr-2013-Variational Inference in Nonconjugate Models
12 0.15191962 16 jmlr-2013-Bayesian Nonparametric Hidden Semi-Markov Models
13 0.14208299 5 jmlr-2013-A Near-Optimal Algorithm for Differentially-Private Principal Components
14 0.14089425 104 jmlr-2013-Sparse Single-Index Model
15 0.13615197 49 jmlr-2013-Global Analytic Solution of Fully-observed Variational Bayesian Matrix Factorization
17 0.12910789 9 jmlr-2013-A Widely Applicable Bayesian Information Criterion
18 0.12600943 85 jmlr-2013-Pairwise Likelihood Ratios for Estimation of Non-Gaussian Structural Equation Models
19 0.12010051 19 jmlr-2013-BudgetedSVM: A Toolbox for Scalable SVM Approximations
20 0.11888875 108 jmlr-2013-Stochastic Variational Inference
topicId topicWeight
[(0, 0.02), (5, 0.055), (6, 0.022), (10, 0.028), (61, 0.015), (70, 0.011), (75, 0.705), (87, 0.025), (93, 0.015)]
simIndex simValue paperId paperTitle
same-paper 1 0.96384943 45 jmlr-2013-GPstuff: Bayesian Modeling with Gaussian Processes
Author: Jarno Vanhatalo, Jaakko Riihimäki, Jouni Hartikainen, Pasi Jylänki, Ville Tolvanen, Aki Vehtari
Abstract: The GPstuff toolbox is a versatile collection of Gaussian process models and computational tools required for Bayesian inference. The tools include, among others, various inference methods, sparse approximations and model assessment methods. Keywords: Gaussian process, Bayesian hierarchical model, nonparametric Bayes
2 0.91102988 109 jmlr-2013-Stress Functions for Nonlinear Dimension Reduction, Proximity Analysis, and Graph Drawing
Author: Lisha Chen, Andreas Buja
Abstract: Multidimensional scaling (MDS) is the art of reconstructing pointsets (embeddings) from pairwise distance data, and as such it is at the basis of several approaches to nonlinear dimension reduction and manifold learning. At present, MDS lacks a unifying methodology as it consists of a discrete collection of proposals that differ in their optimization criteria, called “stress functions”. To correct this situation we propose (1) to embed many of the extant stress functions in a parametric family of stress functions, and (2) to replace the ad hoc choice among discrete proposals with a principled parameter selection method. This methodology yields the following benefits and problem solutions: (a) It provides guidance in tailoring stress functions to a given data situation, responding to the fact that no single stress function dominates all others across all data situations; (b) the methodology enriches the supply of available stress functions; (c) it helps our understanding of stress functions by replacing the comparison of discrete proposals with a characterization of the effect of parameters on embeddings; (d) it builds a bridge to graph drawing, which is the related but not identical art of constructing embeddings from graphs. Keywords: multidimensional scaling, force-directed layout, cluster analysis, clustering strength, unsupervised learning, Box-Cox transformations
3 0.82589376 115 jmlr-2013-Training Energy-Based Models for Time-Series Imputation
Author: Philémon Brakel, Dirk Stroobandt, Benjamin Schrauwen
Abstract: Imputing missing values in high dimensional time-series is a difficult problem. This paper presents a strategy for training energy-based graphical models for imputation directly, bypassing difficulties probabilistic approaches would face. The training strategy is inspired by recent work on optimization-based learning (Domke, 2012) and allows complex neural models with convolutional and recurrent structures to be trained for imputation tasks. In this work, we use this training strategy to derive learning rules for three substantially different neural architectures. Inference in these models is done by either truncated gradient descent or variational mean-field iterations. In our experiments, we found that the training methods outperform the Contrastive Divergence learning algorithm. Moreover, the training methods can easily handle missing values in the training data itself during learning. We demonstrate the performance of this learning scheme and the three models we introduce on one artificial and two real-world data sets. Keywords: neural networks, energy-based models, time-series, missing values, optimization
4 0.7926439 23 jmlr-2013-Cluster Analysis: Unsupervised Learning via Supervised Learning with a Non-convex Penalty
Author: Wei Pan, Xiaotong Shen, Binghui Liu
Abstract: Clustering analysis is widely used in many fields. Traditionally clustering is regarded as unsupervised learning for its lack of a class label or a quantitative response variable, which in contrast is present in supervised learning such as classification and regression. Here we formulate clustering as penalized regression with grouping pursuit. In addition to the novel use of a non-convex group penalty and its associated unique operating characteristics in the proposed clustering method, a main advantage of this formulation is its allowing borrowing some well established results in classification and regression, such as model selection criteria to select the number of clusters, a difficult problem in clustering analysis. In particular, we propose using the generalized cross-validation (GCV) based on generalized degrees of freedom (GDF) to select the number of clusters. We use a few simple numerical examples to compare our proposed method with some existing approaches, demonstrating our method’s promising performance. Keywords: generalized degrees of freedom, grouping, K-means clustering, Lasso, penalized regression, truncated Lasso penalty (TLP)
5 0.73373383 21 jmlr-2013-Classifier Selection using the Predicate Depth
Author: Ran Gilad-Bachrach, Christopher J.C. Burges
Abstract: Typically, one approaches a supervised machine learning problem by writing down an objective function and finding a hypothesis that minimizes it. This is equivalent to finding the Maximum A Posteriori (MAP) hypothesis for a Boltzmann distribution. However, MAP is not a robust statistic. We present an alternative approach by defining a median of the distribution, which we show is both more robust, and has good generalization guarantees. We present algorithms to approximate this median. One contribution of this work is an efficient method for approximating the Tukey median. The Tukey median, which is often used for data visualization and outlier detection, is a special case of the family of medians we define: however, computing it exactly is exponentially slow in the dimension. Our algorithm approximates such medians in polynomial time while making weaker assumptions than those required by previous work. Keywords: classification, estimation, median, Tukey depth
6 0.55115151 75 jmlr-2013-Nested Expectation Propagation for Gaussian Process Classification with a Multinomial Probit Likelihood
7 0.47932315 3 jmlr-2013-A Framework for Evaluating Approximation Methods for Gaussian Process Regression
8 0.44554493 47 jmlr-2013-Gaussian Kullback-Leibler Approximate Inference
9 0.42223027 120 jmlr-2013-Variational Algorithms for Marginal MAP
10 0.39082089 86 jmlr-2013-Parallel Vector Field Embedding
11 0.38763854 108 jmlr-2013-Stochastic Variational Inference
12 0.38602251 118 jmlr-2013-Using Symmetry and Evolutionary Search to Minimize Sorting Networks
13 0.3694379 88 jmlr-2013-Perturbative Corrections for Approximate Inference in Gaussian Latent Variable Models
14 0.35840765 93 jmlr-2013-Random Walk Kernels and Learning Curves for Gaussian Process Regression on Random Graphs
15 0.35616741 32 jmlr-2013-Differential Privacy for Functions and Functional Data
16 0.34837729 38 jmlr-2013-Dynamic Affine-Invariant Shape-Appearance Handshape Features and Classification in Sign Language Videos
17 0.34025571 59 jmlr-2013-Large-scale SVD and Manifold Learning
19 0.33625892 121 jmlr-2013-Variational Inference in Nonconjugate Models
20 0.33470964 22 jmlr-2013-Classifying With Confidence From Incomplete Information