jmlr jmlr2012 jmlr2012-119 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Hannes Nickisch
Abstract: The glm-ie toolbox contains functionality for estimation and inference in generalised linear models over continuous-valued variables. Besides a variety of penalised least squares solvers for estimation, it offers inference based on (convex) variational bounds, on expectation propagation and on factorial mean field. Scalable and efficient inference in fully-connected undirected graphical models or Markov random fields with Gaussian and non-Gaussian potentials is achieved by casting all the computations as matrix vector multiplications. We provide a wide choice of penalty functions for estimation, potential functions for inference and matrix classes with lazy evaluation for convenient modelling. We designed the glm-ie package to be simple, generic and easily expansible. Most of the code is written in Matlab including some MEX files to be fully compatible to both Matlab 7.x and GNU Octave 3.3.x. Large scale probabilistic classification as well as sparse linear modelling can be performed in a common algorithmical framework by the glm-ie toolkit. Keywords: sparse linear models, generalised linear models, Bayesian inference, approximate inference, probabilistic regression and classification, penalised least squares estimation, lazy evaluation matrix class
Reference: text
sentIndex sentText sentNum sentScore
1 Max Planck Institute for Biological Cybernetics, Spemannstraße 38, 72076 Tübingen, Germany. Editor: Mikio Braun. Abstract: The glm-ie toolbox contains functionality for estimation and inference in generalised linear models over continuous-valued variables. [sent-2, score-0.5]
2 Besides a variety of penalised least squares solvers for estimation, it offers inference based on (convex) variational bounds, on expectation propagation and on factorial mean field. [sent-3, score-0.728]
3 Scalable and efficient inference in fully-connected undirected graphical models or Markov random fields with Gaussian and non-Gaussian potentials is achieved by casting all the computations as matrix vector multiplications. [sent-4, score-0.421]
4 We provide a wide choice of penalty functions for estimation, potential functions for inference and matrix classes with lazy evaluation for convenient modelling. [sent-5, score-0.479]
5 Most of the code is written in Matlab, including some MEX files, to be fully compatible with both Matlab 7.x and GNU Octave 3.3.x. [sent-7, score-0.062]
6 Keywords: sparse linear models, generalised linear models, Bayesian inference, approximate inference, probabilistic regression and classification, penalised least squares estimation, lazy evaluation matrix class 1. [sent-12, score-0.433]
7 Estimation of the unknown parameters by maximum likelihood and penalised variants thereof can be done by iteratively reweighted least squares (IRLS). [sent-15, score-0.226]
8 Penalised least squares (PLS) corresponds to maximum a posteriori estimation (MAP) in a Bayesian model. [sent-16, score-0.105]
9 (Approximate) Bayesian inference as opposed to MAP places the parameter estimate at the centre of mass rather than at the mode of the posterior distribution. [sent-17, score-0.242]
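For reference, the model equations referred to below as Equation (1) and Equation (2) are not reproduced in this extraction. A hedged reconstruction, following the standard GLM setup described in the text (Gaussian observations y = Xu + ε and non-Gaussian potentials on s = Bu), is sketched here; the exact scaling conventions are assumptions.

% Hedged reconstruction of Equations (1) and (2); normalisation and scaling assumed.
\[
\text{(1)}\qquad P(u \mid y) \;\propto\; \mathcal{N}\!\left(y \mid Xu,\; \sigma^{2} I\right)\, \prod_{j=1}^{q} T_j(s_j), \qquad s = Bu,
\]
\[
\text{(2)}\qquad \hat{u}_{\mathrm{MAP}} \;=\; \arg\min_{u}\; \frac{1}{2\sigma^{2}}\,\lVert y - Xu \rVert_2^{2} \;+\; \sum_{j=1}^{q} \rho_j(s_j), \qquad \rho_j(s_j) = -\ln T_j(s_j).
\]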
10 , 2010), • Variational Bounding (VB) (Seeger and Nickisch, 2011) based on decoupling (Wipf and Nagarajan, 2008) and (convex) (Nickisch and Seeger, 2009) optimisation, and • Mean field (MF) (Miskin and MacKay, 2000) factorial inference. [sent-19, score-0.062]
11 While EP yields very accurate approximations for small models, VB allows for efficient computations in large-scale models by a sequence of variance-smoothed PLS problems, which makes experimental design for imaging applications (Seeger et al. [sent-21, score-0.151]
12 NET framework both of which do not offer scalable matrix vector multiplication (MVM) based approximate inference. [sent-24, score-0.082]
13 Implementation and Model Class The glm-ie toolbox can be obtained from http://mloss. [sent-26, score-0.161]
14 Based on simple interfaces for potential, penalty, and estimation functions as well as inference methods and matrix classes, we offer full compatibility with Matlab 7.x and GNU Octave 3.3.x. [sent-30, score-0.357]
15 Our documentation comes in two parts: (i) a hypertext document doc/index. [sent-36, score-0.056]
16 pdf explaining the interfaces to allow inclusion of new functionality. [sent-38, score-0.05]
17 Overall, a GLM can be specified by three kinds of objects: (i) potentials T (s) and penalties ρ(s), (ii) matrices X, B and (iii) PLS algorithms. [sent-45, score-0.159]
18 Together with the responses y, scalar parameters and optimisation options, these three constituents serve as inputs to the double loop inference engine dli computing approximations to ln Z, m and V. [sent-46, score-0.599]
19 1 Potential and Penalty Functions Several non-Gaussian potentials T(s) can be used to shape the posterior distribution of Equation (1). [sent-70, score-0.226]
20 In addition to the Gaussian potential potGauss, we provide several sparse potentials (exponential power potExpPow, Laplace potLaplace, Sech-squared potSech2, Student’s t potT) and the logistic potential potLogistic for binary classification. [sent-71, score-0.398]
21 In MAP estimation, we most naturally use the penalty function ρ(s) = − ln T (s) in Equation (2) which includes penalty functions like p-norms penAbs, penQuad, penPow. [sent-72, score-0.266]
22 Approximate inference by variational bounding requires penalty functions derived from a potential function, for example, penVB or penVBNorm. [sent-73, score-0.528]
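To make the correspondence concrete, here is a minimal, hypothetical Matlab sketch of selecting a potential and its matching penalty; the function names are taken from the text, but the handle-based convention is an assumption rather than the documented glm-ie interface.

% Hypothetical sketch (not the documented glm-ie API); names from the text, usage assumed.
pot = @potLaplace;   % sparsity-inducing Laplace potential, T(s) proportional to exp(-|s|)
pen = @penAbs;       % matching MAP penalty rho(s) = |s|, i.e. -ln T(s) up to constants
% For variational bounding, a penalty derived from the potential (e.g. penVB)
% would replace penAbs in the inner-loop PLS problem.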
23 2 General Matrix Class and Implementations To facilitate the specification and composition of system matrices X, B (and their respective transposes) in a GLM, the glm-ie toolbox contains a specialised matrix class mat. [sent-75, score-0.259]
24 As MVMs form the most important computations in both estimation and inference, the class mat provides addition, transposition, scaling, composition, and concatenation. [sent-76, score-0.204]
25 Thus, expressions like A+B', A*B, A*x, a*A, [A,B], [A;B], A(:,1), kron(A,B), repmat(A,[2,3]) are possible even though A or B are of type mat and have a size that would not fit into memory if stored as a dense array. [sent-77, score-0.103]
26 We provide several matrix classes that are derived from mat and implement their own MVM. [sent-79, score-0.139]
27 Besides 2d convolution matConv2, diagonal matDiag, finite difference matFD2 and (quadrature mirror) wavelet matrices matWav, we offer three kinds of Fourier matrices matFFT2line, matFFTNmask and matFFT2nu allowing for nonuniform spacing. [sent-80, score-0.173]
28 Computations only take place through MVMs; the other operations (addition, composition, etc.) are evaluated lazily. [sent-81, score-0.062]
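As an illustration, a hypothetical sketch of composing such implicit matrices is given below; only the class names are taken from the text, and the constructor arguments (image size, sampling mask) are assumptions.

% Hypothetical sketch; constructor signatures are assumptions.
sz   = [256, 256];            % image size
mask = rand(sz) < 0.3;        % assumed Fourier sampling mask
X = matFFTNmask(mask);        % masked Fourier operator, conceptually X = M*F
B = [matFD2(sz); matWav(sz)]; % finite differences stacked on wavelet coefficients
u = randn(prod(sz), 1);       % a vectorised test image
y = X*u;                      % evaluated as an MVM; X is never stored densely
s = B*u;                      % compositions and transposes (e.g. B') stay lazy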
29 3 Penalised Least Squares Solvers The glm-ie toolbox contains several solvers for the PLS estimation problem of Equation (2): • plsCG: Conjugate Gradients (CG) using a standalone solver (Rasmussen, 2006), • plsCGBT: CG with an Armijo backtracking rule (Lustig et al., 2009), [sent-84, score-0.356]
30 • plsLBFGS: uses a wrapper for the famous LBFGSB code (L-BFGS-B, 1997), • plsBB: first order two-point step size rule (Barzilai and Borwein, 1988), and • plsSB: Bregman Splitting (Goldstein and Osher, 2009). [sent-86, score-0.062]
31 The solvers can be used for a standalone estimation task or—together with the penVB penalty function—as the inner loop of the double loop variational inference algorithm. [sent-87, score-0.822]
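Continuing the hypothetical sketch above, a standalone estimation call might look as follows; the argument order of the solver is an assumption, and swapping plsLBFGS for plsCG, plsCGBT, plsBB or plsSB would only change the handle.

% Hypothetical standalone PLS call; argument order assumed, not documented.
s2  = 1e-3;                       % assumed noise variance
pls = @plsLBFGS;                  % alternatives: @plsCG, @plsCGBT, @plsBB, @plsSB
uMAP = pls(zeros(size(B,2),1), X, y, B, [], 1/s2, @penAbs);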
32 We use an additional scale parameter τ, that is, the rescaled potential T (τs). [sent-89, score-0.098]
33 The sech-square distribution is another name for the logistic distribution. [sent-91, score-0.043]
34 We use sech-square to avoid a name clash with the logistic classification potential. [sent-92, score-0.043]
35 Example and Code To illustrate the modular structure of the glm-ie toolbox, we provide a simple code example. [sent-94, score-0.112]
36 We use a sparse linear model to describe Fourier measurements y = Xu + ε ∈ C^m of an unknown pixel image u ∈ R^n, where the design matrix X contains a subset of the rows of the Fourier matrix F, that is, X = MF, as specified by a measurement masking matrix M ∈ {0,1}^{m×n}. [sent-95, score-0.152]
37 As a prior, we employ the knowledge that the filter responses s = Bu of natural images with zero mean filters b_j, j = 1..q, follow a sparse distribution imitated by the Laplace potential T_j(s_j) = exp(−τ|s_j|). [sent-96, score-0.036] [sent-98, score-0.098]
39 More specifically, the filter matrix B contains multiscale derivatives; it is a concatenation of finite differences in both image directions and wavelet coefficients. [sent-99, score-0.154]
40 This model makes it possible to reconstruct images from undersampled magnetic resonance imaging scanner measurements, where m < n. [sent-100, score-0.36]
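The code listing that the following walkthrough (lines 2-9) refers to is not reproduced in this extraction; the actual example ships with the glm-ie documentation. A hypothetical reconstruction based purely on the description below is sketched here, with the assumed line numbering marked in comments; all call signatures, sizes and parameter values are assumptions.

% Hypothetical reconstruction, NOT the shipped example; comments mark the assumed
% line numbers that the walkthrough refers to ("lines 2-3", "line 4", ...).
sz = [64,64]; utrue = randn(prod(sz),1); mask = rand(sz)<0.3; opt = [];  % 1: toy setup (assumed)
s2 = 1e-3;                                                               % 2: noise variance sigma^2
X  = matFFTNmask(mask); y = X*utrue + sqrt(s2)*randn(size(X,1),1);       % 3: X = M*F and data y
B  = [matFD2(sz); matWav(sz)];                                           % 4: finite differences + wavelets
pen = @penAbs;                                                           % 5: penalty rho(s) = sum_j |s_j|
uMAP = plsLBFGS(zeros(prod(sz),1), X, y, B, opt, 1/s2, pen);             % 6: MAP estimate by PLS (assumed args)
pot = @potLaplace;                                                       % 7: Laplace potential T
tau = 15;                                                                % 8: scale, T(tau*s) = exp(-tau*|s|)
[m,ga,b,z,zu,nlZ] = dli(X, y, s2, B, pot, tau, opt);                     % 9: double loop inference (assumed args)

In the real example the measurements y come from an undersampled MRI scan rather than the synthetic data used in this sketch.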
41 The observation noise variance σ2 and the design matrix X as declared in lines 2-3 form the Gaussian part. [sent-104, score-0.098]
42 The filter matrix B for the non-Gaussian part is constructed in line 4 as the concatenation of two matrices. [sent-105, score-0.074]
43 In the next two lines, we perform estimation by optimising equation (2) using the plsLBFGS solver in line 6 for the penalty function ρ(s) = ∑ j |s j | as declared in line 5. [sent-107, score-0.225]
44 Variational inference requires a potential T (τs); here T (τs) = exp(−τ|s|), with scale τ, is defined in lines 7-8. [sent-108, score-0.273]
45 The engine for double loop inference dli is finally called in line 9 yielding the posterior mean estimate m, the variational parameters γ and β, the marginal variances z = var(Bu), zu = var(u) and the negative log evidence − ln Z ≡ nlZ of the model. [sent-109, score-0.722]
46 Acknowledgments Thanks go to George Papandreou for contributing his marginal variance estimator (Papandreou and Yuille, 2011), to Matthias Seeger for helpful suggestions, Michael Hirsch for testing and contributing the convolution code and Wolf Blecher for testing. [sent-110, score-0.215]
47 The example is a shortened version of one of the detailed application examples, which are part of the documentation of the glm-ie package. [sent-116, score-0.056]
48 Sparse MRI: The application of compressed sensing for rapid MR imaging. [sent-128, score-0.097]
49 Convex variational Bayesian inference for large scale generalized linear models. [sent-139, score-0.317]
50 Large scale variational inference and experimental design for sparse generalized linear models. [sent-159, score-0.317]
51 Bayesian experimental design of magnetic resonance imaging sequences. [sent-163, score-0.36]
52 Optimization of k-space trajectories for compressed sensing by Bayesian experimental design. [sent-167, score-0.169]
wordName wordTfidf (topN-words)
[('pls', 0.28), ('hannes', 0.24), ('nickisch', 0.24), ('seeger', 0.181), ('inference', 0.175), ('penalised', 0.171), ('toolbox', 0.161), ('potentials', 0.159), ('variational', 0.142), ('resonance', 0.137), ('mask', 0.137), ('matthias', 0.125), ('magnetic', 0.123), ('dli', 0.12), ('glms', 0.12), ('mvms', 0.12), ('papandreou', 0.12), ('plslbfgs', 0.12), ('umap', 0.12), ('generalised', 0.114), ('penalty', 0.113), ('bu', 0.106), ('mat', 0.103), ('define', 0.103), ('imaging', 0.1), ('potential', 0.098), ('octave', 0.093), ('xu', 0.082), ('barzilai', 0.08), ('eneralised', 0.08), ('gerven', 0.08), ('ickisch', 0.08), ('lustig', 0.08), ('matfftnmask', 0.08), ('matwav', 0.08), ('miskin', 0.08), ('nlz', 0.08), ('penabs', 0.08), ('penvb', 0.08), ('pohmann', 0.08), ('potlaplace', 0.08), ('rolf', 0.08), ('standalone', 0.08), ('fourier', 0.08), ('laplace', 0.08), ('wavelet', 0.08), ('vb', 0.08), ('su', 0.075), ('bayesian', 0.072), ('tom', 0.071), ('double', 0.069), ('cg', 0.068), ('glm', 0.068), ('posterior', 0.067), ('solvers', 0.065), ('loop', 0.064), ('code', 0.062), ('composition', 0.062), ('goldstein', 0.062), ('wipf', 0.062), ('declared', 0.062), ('irls', 0.062), ('factorial', 0.062), ('propagation', 0.058), ('nference', 0.057), ('jonathan', 0.057), ('lazy', 0.057), ('lter', 0.056), ('documentation', 0.056), ('squares', 0.055), ('contributing', 0.053), ('pen', 0.053), ('mf', 0.053), ('computations', 0.051), ('compressed', 0.05), ('interfaces', 0.05), ('optimisation', 0.05), ('stimation', 0.05), ('modular', 0.05), ('estimation', 0.05), ('opper', 0.047), ('sensing', 0.047), ('george', 0.047), ('convolution', 0.047), ('offer', 0.046), ('engine', 0.045), ('matlab', 0.045), ('measurements', 0.044), ('ep', 0.043), ('logistic', 0.043), ('gnu', 0.042), ('ln', 0.04), ('inear', 0.04), ('odels', 0.04), ('concatenation', 0.038), ('david', 0.037), ('matrix', 0.036), ('responses', 0.036), ('bernhard', 0.035), ('bregman', 0.035)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999958 119 jmlr-2012-glm-ie: Generalised Linear Models Inference & Estimation Toolbox
Author: Hannes Nickisch
Abstract: The glm-ie toolbox contains functionality for estimation and inference in generalised linear models over continuous-valued variables. Besides a variety of penalised least squares solvers for estimation, it offers inference based on (convex) variational bounds, on expectation propagation and on factorial mean field. Scalable and efficient inference in fully-connected undirected graphical models or Markov random fields with Gaussian and non-Gaussian potentials is achieved by casting all the computations as matrix vector multiplications. We provide a wide choice of penalty functions for estimation, potential functions for inference and matrix classes with lazy evaluation for convenient modelling. We designed the glm-ie package to be simple, generic and easily expansible. Most of the code is written in Matlab including some MEX files to be fully compatible to both Matlab 7.x and GNU Octave 3.3.x. Large scale probabilistic classification as well as sparse linear modelling can be performed in a common algorithmical framework by the glm-ie toolkit. Keywords: sparse linear models, generalised linear models, Bayesian inference, approximate inference, probabilistic regression and classification, penalised least squares estimation, lazy evaluation matrix class
2 0.086379297 118 jmlr-2012-Variational Multinomial Logit Gaussian Process
Author: Kian Ming A. Chai
Abstract: Gaussian process prior with an appropriate likelihood function is a flexible non-parametric model for a variety of learning tasks. One important and standard task is multi-class classification, which is the categorization of an item into one of several fixed classes. A usual likelihood function for this is the multinomial logistic likelihood function. However, exact inference with this model has proved to be difficult because high-dimensional integrations are required. In this paper, we propose a variational approximation to this model, and we describe the optimization of the variational parameters. Experiments have shown our approximation to be tight. In addition, we provide dataindependent bounds on the marginal likelihood of the model, one of which is shown to be much tighter than the existing variational mean-field bound in the experiments. We also derive a proper lower bound on the predictive likelihood that involves the Kullback-Leibler divergence between the approximating and the true posterior. We combine our approach with a recently proposed sparse approximation to give a variational sparse approximation to the Gaussian process multi-class model. We also derive criteria which can be used to select the inducing set, and we show the effectiveness of these criteria over random selection in an experiment. Keywords: Gaussian process, probabilistic classification, multinomial logistic, variational approximation, sparse approximation
3 0.085465744 21 jmlr-2012-Bayesian Mixed-Effects Inference on Classification Performance in Hierarchical Data Sets
Author: Kay H. Brodersen, Christoph Mathys, Justin R. Chumbley, Jean Daunizeau, Cheng Soon Ong, Joachim M. Buhmann, Klaas E. Stephan
Abstract: Classification algorithms are frequently used on data with a natural hierarchical structure. For instance, classifiers are often trained and tested on trial-wise measurements, separately for each subject within a group. One important question is how classification outcomes observed in individual subjects can be generalized to the population from which the group was sampled. To address this question, this paper introduces novel statistical models that are guided by three desiderata. First, all models explicitly respect the hierarchical nature of the data, that is, they are mixed-effects models that simultaneously account for within-subjects (fixed-effects) and across-subjects (random-effects) variance components. Second, maximum-likelihood estimation is replaced by full Bayesian inference in order to enable natural regularization of the estimation problem and to afford conclusions in terms of posterior probability statements. Third, inference on classification accuracy is complemented by inference on the balanced accuracy, which avoids inflated accuracy estimates for imbalanced data sets. We introduce hierarchical models that satisfy these criteria and demonstrate their advantages over conventional methods using MCMC implementations for model inversion and model selection on both synthetic and empirical data. We envisage that our approach will improve the sensitivity and validity of statistical inference in future hierarchical classification studies. Keywords: beta-binomial, normal-binomial, balanced accuracy, Bayesian inference, group studies
4 0.069431409 47 jmlr-2012-GPLP: A Local and Parallel Computation Toolbox for Gaussian Process Regression
Author: Chiwoo Park, Jianhua Z. Huang, Yu Ding
Abstract: This paper presents the Getting-started style documentation for the local and parallel computation toolbox for Gaussian process regression (GPLP), an open source software package written in Matlab (but also compatible with Octave). The working environment and the usage of the software package will be presented in this paper. Keywords: Gaussian process regression, domain decomposition method, partial independent conditional, bagging for Gaussian process, local probabilistic regression
5 0.062950827 9 jmlr-2012-A Topic Modeling Toolbox Using Belief Propagation
Author: Jia Zeng
Abstract: Latent Dirichlet allocation (LDA) is an important hierarchical Bayesian model for probabilistic topic modeling, which attracts worldwide interests and touches on many important applications in text mining, computer vision and computational biology. This paper introduces a topic modeling toolbox (TMBP) based on the belief propagation (BP) algorithms. TMBP toolbox is implemented by MEX C++/Matlab/Octave for either Windows 7 or Linux. Compared with existing topic modeling packages, the novelty of this toolbox lies in the BP algorithms for learning LDA-based topic models. The current version includes BP algorithms for latent Dirichlet allocation (LDA), authortopic models (ATM), relational topic models (RTM), and labeled LDA (LaLDA). This toolbox is an ongoing project and more BP-based algorithms for various topic models will be added in the near future. Interested users may also extend BP algorithms for learning more complicated topic models. The source codes are freely available under the GNU General Public Licence, Version 1.0 at https://mloss.org/software/view/399/. Keywords: topic models, belief propagation, variational Bayes, Gibbs sampling
6 0.061100017 35 jmlr-2012-EP-GIG Priors and Applications in Bayesian Sparse Learning
7 0.053458367 39 jmlr-2012-Estimation and Selection via Absolute Penalized Convex Minimization And Its Multistage Adaptive Applications
8 0.05050993 30 jmlr-2012-DARWIN: A Framework for Machine Learning and Computer Vision Research and Development
9 0.050182886 38 jmlr-2012-Entropy Search for Information-Efficient Global Optimization
10 0.041268248 56 jmlr-2012-Learning Linear Cyclic Causal Models with Latent Variables
11 0.036437966 65 jmlr-2012-MedLDA: Maximum Margin Supervised Topic Models
12 0.03538375 76 jmlr-2012-Noise-Contrastive Estimation of Unnormalized Statistical Models, with Applications to Natural Image Statistics
13 0.035277005 61 jmlr-2012-ML-Flex: A Flexible Toolbox for Performing Classification Analyses In Parallel
14 0.034009531 42 jmlr-2012-Facilitating Score and Causal Inference Trees for Large Observational Studies
15 0.033863407 59 jmlr-2012-Linear Regression With Random Projections
16 0.030916339 31 jmlr-2012-DEAP: Evolutionary Algorithms Made Easy
17 0.030838594 48 jmlr-2012-High-Dimensional Gaussian Graphical Model Selection: Walk Summability and Local Separation Criterion
18 0.030505529 97 jmlr-2012-Regularization Techniques for Learning with Matrices
19 0.030463988 73 jmlr-2012-Multi-task Regression using Minimal Penalties
20 0.030074028 82 jmlr-2012-On the Necessity of Irrelevant Variables
topicId topicWeight
[(0, -0.147), (1, 0.085), (2, 0.117), (3, -0.06), (4, 0.074), (5, 0.077), (6, 0.059), (7, 0.028), (8, -0.205), (9, -0.03), (10, 0.034), (11, -0.0), (12, 0.08), (13, 0.1), (14, 0.068), (15, -0.114), (16, -0.045), (17, 0.102), (18, -0.056), (19, -0.157), (20, 0.077), (21, 0.048), (22, -0.086), (23, 0.053), (24, -0.067), (25, -0.108), (26, -0.028), (27, -0.028), (28, 0.044), (29, 0.088), (30, -0.009), (31, 0.03), (32, 0.045), (33, 0.122), (34, 0.023), (35, -0.08), (36, -0.053), (37, 0.202), (38, 0.041), (39, -0.027), (40, 0.109), (41, 0.062), (42, -0.344), (43, -0.115), (44, -0.076), (45, 0.227), (46, 0.028), (47, -0.166), (48, 0.024), (49, -0.047)]
simIndex simValue paperId paperTitle
same-paper 1 0.96507919 119 jmlr-2012-glm-ie: Generalised Linear Models Inference & Estimation Toolbox
Author: Hannes Nickisch
Abstract: The glm-ie toolbox contains functionality for estimation and inference in generalised linear models over continuous-valued variables. Besides a variety of penalised least squares solvers for estimation, it offers inference based on (convex) variational bounds, on expectation propagation and on factorial mean field. Scalable and efficient inference in fully-connected undirected graphical models or Markov random fields with Gaussian and non-Gaussian potentials is achieved by casting all the computations as matrix vector multiplications. We provide a wide choice of penalty functions for estimation, potential functions for inference and matrix classes with lazy evaluation for convenient modelling. We designed the glm-ie package to be simple, generic and easily expansible. Most of the code is written in Matlab including some MEX files to be fully compatible to both Matlab 7.x and GNU Octave 3.3.x. Large scale probabilistic classification as well as sparse linear modelling can be performed in a common algorithmical framework by the glm-ie toolkit. Keywords: sparse linear models, generalised linear models, Bayesian inference, approximate inference, probabilistic regression and classification, penalised least squares estimation, lazy evaluation matrix class
2 0.44151786 30 jmlr-2012-DARWIN: A Framework for Machine Learning and Computer Vision Research and Development
Author: Stephen Gould
Abstract: We present an open-source platform-independent C++ framework for machine learning and computer vision research. The framework includes a wide range of standard machine learning and graphical models algorithms as well as reference implementations for many machine learning and computer vision applications. The framework contains Matlab wrappers for core components of the library and an experimental graphical user interface for developing and visualizing machine learning data flows. Keywords: machine learning, graphical models, computer vision, open-source software
3 0.43078265 21 jmlr-2012-Bayesian Mixed-Effects Inference on Classification Performance in Hierarchical Data Sets
Author: Kay H. Brodersen, Christoph Mathys, Justin R. Chumbley, Jean Daunizeau, Cheng Soon Ong, Joachim M. Buhmann, Klaas E. Stephan
Abstract: Classification algorithms are frequently used on data with a natural hierarchical structure. For instance, classifiers are often trained and tested on trial-wise measurements, separately for each subject within a group. One important question is how classification outcomes observed in individual subjects can be generalized to the population from which the group was sampled. To address this question, this paper introduces novel statistical models that are guided by three desiderata. First, all models explicitly respect the hierarchical nature of the data, that is, they are mixed-effects models that simultaneously account for within-subjects (fixed-effects) and across-subjects (random-effects) variance components. Second, maximum-likelihood estimation is replaced by full Bayesian inference in order to enable natural regularization of the estimation problem and to afford conclusions in terms of posterior probability statements. Third, inference on classification accuracy is complemented by inference on the balanced accuracy, which avoids inflated accuracy estimates for imbalanced data sets. We introduce hierarchical models that satisfy these criteria and demonstrate their advantages over conventional methods using MCMC implementations for model inversion and model selection on both synthetic and empirical data. We envisage that our approach will improve the sensitivity and validity of statistical inference in future hierarchical classification studies. Keywords: beta-binomial, normal-binomial, balanced accuracy, Bayesian inference, group studies
4 0.35841492 47 jmlr-2012-GPLP: A Local and Parallel Computation Toolbox for Gaussian Process Regression
Author: Chiwoo Park, Jianhua Z. Huang, Yu Ding
Abstract: This paper presents the Getting-started style documentation for the local and parallel computation toolbox for Gaussian process regression (GPLP), an open source software package written in Matlab (but also compatible with Octave). The working environment and the usage of the software package will be presented in this paper. Keywords: Gaussian process regression, domain decomposition method, partial independent conditional, bagging for Gaussian process, local probabilistic regression
5 0.35536364 118 jmlr-2012-Variational Multinomial Logit Gaussian Process
Author: Kian Ming A. Chai
Abstract: Gaussian process prior with an appropriate likelihood function is a flexible non-parametric model for a variety of learning tasks. One important and standard task is multi-class classification, which is the categorization of an item into one of several fixed classes. A usual likelihood function for this is the multinomial logistic likelihood function. However, exact inference with this model has proved to be difficult because high-dimensional integrations are required. In this paper, we propose a variational approximation to this model, and we describe the optimization of the variational parameters. Experiments have shown our approximation to be tight. In addition, we provide dataindependent bounds on the marginal likelihood of the model, one of which is shown to be much tighter than the existing variational mean-field bound in the experiments. We also derive a proper lower bound on the predictive likelihood that involves the Kullback-Leibler divergence between the approximating and the true posterior. We combine our approach with a recently proposed sparse approximation to give a variational sparse approximation to the Gaussian process multi-class model. We also derive criteria which can be used to select the inducing set, and we show the effectiveness of these criteria over random selection in an experiment. Keywords: Gaussian process, probabilistic classification, multinomial logistic, variational approximation, sparse approximation
6 0.32971361 56 jmlr-2012-Learning Linear Cyclic Causal Models with Latent Variables
7 0.29961216 38 jmlr-2012-Entropy Search for Information-Efficient Global Optimization
8 0.28315598 9 jmlr-2012-A Topic Modeling Toolbox Using Belief Propagation
9 0.27600858 35 jmlr-2012-EP-GIG Priors and Applications in Bayesian Sparse Learning
10 0.22840309 76 jmlr-2012-Noise-Contrastive Estimation of Unnormalized Statistical Models, with Applications to Natural Image Statistics
11 0.21828108 53 jmlr-2012-Jstacs: A Java Framework for Statistical Analysis and Classification of Biological Sequences
12 0.21526559 61 jmlr-2012-ML-Flex: A Flexible Toolbox for Performing Classification Analyses In Parallel
13 0.21466692 52 jmlr-2012-Iterative Reweighted Algorithms for Matrix Rank Minimization
14 0.20348018 39 jmlr-2012-Estimation and Selection via Absolute Penalized Convex Minimization And Its Multistage Adaptive Applications
15 0.20026687 31 jmlr-2012-DEAP: Evolutionary Algorithms Made Easy
16 0.19336894 42 jmlr-2012-Facilitating Score and Causal Inference Trees for Large Observational Studies
17 0.19203204 79 jmlr-2012-Oger: Modular Learning Architectures For Large-Scale Sequential Processing
18 0.1855647 65 jmlr-2012-MedLDA: Maximum Margin Supervised Topic Models
19 0.17763668 2 jmlr-2012-A Comparison of the Lasso and Marginal Regression
20 0.16568132 32 jmlr-2012-Discriminative Hierarchical Part-based Models for Human Parsing and Action Recognition
topicId topicWeight
[(0, 0.01), (7, 0.012), (21, 0.055), (26, 0.032), (27, 0.019), (29, 0.059), (43, 0.471), (49, 0.047), (56, 0.026), (57, 0.012), (64, 0.011), (69, 0.019), (75, 0.026), (92, 0.048), (96, 0.069)]
simIndex simValue paperId paperTitle
same-paper 1 0.7976166 119 jmlr-2012-glm-ie: Generalised Linear Models Inference & Estimation Toolbox
Author: Hannes Nickisch
Abstract: The glm-ie toolbox contains functionality for estimation and inference in generalised linear models over continuous-valued variables. Besides a variety of penalised least squares solvers for estimation, it offers inference based on (convex) variational bounds, on expectation propagation and on factorial mean field. Scalable and efficient inference in fully-connected undirected graphical models or Markov random fields with Gaussian and non-Gaussian potentials is achieved by casting all the computations as matrix vector multiplications. We provide a wide choice of penalty functions for estimation, potential functions for inference and matrix classes with lazy evaluation for convenient modelling. We designed the glm-ie package to be simple, generic and easily expansible. Most of the code is written in Matlab including some MEX files to be fully compatible to both Matlab 7.x and GNU Octave 3.3.x. Large scale probabilistic classification as well as sparse linear modelling can be performed in a common algorithmical framework by the glm-ie toolkit. Keywords: sparse linear models, generalised linear models, Bayesian inference, approximate inference, probabilistic regression and classification, penalised least squares estimation, lazy evaluation matrix class
2 0.69246411 57 jmlr-2012-Learning Symbolic Representations of Hybrid Dynamical Systems
Author: Daniel L. Ly, Hod Lipson
Abstract: A hybrid dynamical system is a mathematical model suitable for describing an extensive spectrum of multi-modal, time-series behaviors, ranging from bouncing balls to air traffic controllers. This paper describes multi-modal symbolic regression (MMSR): a learning algorithm to construct non-linear symbolic representations of discrete dynamical systems with continuous mappings from unlabeled, time-series data. MMSR consists of two subalgorithms—clustered symbolic regression, a method to simultaneously identify distinct behaviors while formulating their mathematical expressions, and transition modeling, an algorithm to infer symbolic inequalities that describe binary classification boundaries. These subalgorithms are combined to infer hybrid dynamical systems as a collection of apt, mathematical expressions. MMSR is evaluated on a collection of four synthetic data sets and outperforms other multi-modal machine learning approaches in both accuracy and interpretability, even in the presence of noise. Furthermore, the versatility of MMSR is demonstrated by identifying and inferring classical expressions of transistor modes from recorded measurements. Keywords: hybrid dynamical systems, evolutionary computation, symbolic piecewise functions, symbolic binary classification
3 0.23900399 81 jmlr-2012-On the Convergence Rate oflp-Norm Multiple Kernel Learning
Author: Marius Kloft, Gilles Blanchard
Abstract: We derive an upper bound on the local Rademacher complexity of ℓp-norm multiple kernel learning, which yields a tighter excess risk bound than global approaches. Previous local approaches analyzed the case p = 1 only, while our analysis covers all cases 1 ≤ p ≤ ∞, assuming the different feature mappings corresponding to the different kernels to be uncorrelated. We also show a lower bound that shows that the bound is tight, and derive consequences regarding excess loss, namely fast convergence rates of the order O(n^{−α/(1+α)}), where α is the minimum eigenvalue decay rate of the individual kernels. Keywords: multiple kernel learning, learning kernels, generalization bounds, local Rademacher complexity
4 0.23448788 35 jmlr-2012-EP-GIG Priors and Applications in Bayesian Sparse Learning
Author: Zhihua Zhang, Shusen Wang, Dehua Liu, Michael I. Jordan
Abstract: In this paper we propose a novel framework for the construction of sparsity-inducing priors. In particular, we define such priors as a mixture of exponential power distributions with a generalized inverse Gaussian density (EP-GIG). EP-GIG is a variant of generalized hyperbolic distributions, and the special cases include Gaussian scale mixtures and Laplace scale mixtures. Furthermore, Laplace scale mixtures can subserve a Bayesian framework for sparse learning with nonconvex penalization. The densities of EP-GIG can be explicitly expressed. Moreover, the corresponding posterior distribution also follows a generalized inverse Gaussian distribution. We exploit these properties to develop EM algorithms for sparse empirical Bayesian learning. We also show that these algorithms bear an interesting resemblance to iteratively reweighted ℓ2 or ℓ1 methods. Finally, we present two extensions for grouped variable selection and logistic regression. Keywords: sparsity priors, scale mixtures of exponential power distributions, generalized inverse Gaussian distributions, expectation-maximization algorithms, iteratively reweighted minimization methods
5 0.23412682 4 jmlr-2012-A Kernel Two-Sample Test
Author: Arthur Gretton, Karsten M. Borgwardt, Malte J. Rasch, Bernhard Schölkopf, Alexander Smola
Abstract: We propose a framework for analyzing and comparing distributions, which we use to construct statistical tests to determine if two samples are drawn from different distributions. Our test statistic is the largest difference in expectations over functions in the unit ball of a reproducing kernel Hilbert space (RKHS), and is called the maximum mean discrepancy (MMD). We present two distribution-free tests based on large deviation bounds for the MMD, and a third test based on the asymptotic distribution of this statistic. The MMD can be computed in quadratic time, although efficient linear time approximations are available. Our statistic is an instance of an integral probability metric, and various classical metrics on distributions are obtained when alternative function classes are used in place of an RKHS. We apply our two-sample tests to a variety of problems, including attribute matching for databases using the Hungarian marriage method, where they perform strongly. Excellent performance is also obtained when comparing distributions over graphs, for which these are the first such tests. Also at Gatsby Computational Neuroscience Unit, CSML, 17 Queen Square, London WC1N 3AR, UK. This work was carried out while K.M.B. was with the Ludwig-Maximilians-Universität München. This work was carried out while M.J.R. was with the Graz University of Technology. Also at The Australian National University, Canberra, ACT 0200, Australia. Keywords: kernel methods, two-sample test, uniform convergence bounds, schema matching, integral probability metric, hypothesis testing
6 0.22918858 38 jmlr-2012-Entropy Search for Information-Efficient Global Optimization
7 0.2289221 42 jmlr-2012-Facilitating Score and Causal Inference Trees for Large Observational Studies
8 0.22692212 117 jmlr-2012-Variable Selection in High-dimensional Varying-coefficient Models with Global Optimality
9 0.22453293 8 jmlr-2012-A Primal-Dual Convergence Analysis of Boosting
10 0.22413531 21 jmlr-2012-Bayesian Mixed-Effects Inference on Classification Performance in Hierarchical Data Sets
11 0.22401974 85 jmlr-2012-Optimal Distributed Online Prediction Using Mini-Batches
12 0.2238286 103 jmlr-2012-Sampling Methods for the Nyström Method
13 0.22312036 11 jmlr-2012-A Unifying Probabilistic Perspective for Spectral Dimensionality Reduction: Insights and New Models
14 0.2220006 64 jmlr-2012-Manifold Identification in Dual Averaging for Regularized Stochastic Online Learning
15 0.22136497 92 jmlr-2012-Positive Semidefinite Metric Learning Using Boosting-like Algorithms
16 0.22117338 72 jmlr-2012-Multi-Target Regression with Rule Ensembles
17 0.22090828 56 jmlr-2012-Learning Linear Cyclic Causal Models with Latent Variables
18 0.21895999 114 jmlr-2012-Towards Integrative Causal Analysis of Heterogeneous Data Sets and Studies
19 0.21879765 98 jmlr-2012-Regularized Bundle Methods for Convex and Non-Convex Risks
20 0.21873504 73 jmlr-2012-Multi-task Regression using Minimal Penalties