jmlr jmlr2008 jmlr2008-85 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Christian Igel, Verena Heidrich-Meisner, Tobias Glasmachers
Abstract: SHARK is an object-oriented library for the design of adaptive systems. It comprises methods for single- and multi-objective optimization (e.g., evolutionary and gradient-based algorithms) as well as kernel-based methods, neural networks, and other machine learning techniques. Keywords: machine learning software, neural networks, kernel-methods, evolutionary algorithms, optimization, multi-objective-optimization 1. Overview SHARK is a modular C++ library for the design and optimization of adaptive systems. It serves as a toolbox for real world applications and basic research in computational intelligence and machine learning. The library provides methods for single- and multi-objective optimization, in particular evolutionary and gradient-based algorithms, kernel-based learning methods, neural networks, and many other machine learning techniques. Its main design criteria are flexibility and speed. Here we restrict the description of SHARK to its core components, although the library contains plenty of additional functionality. Further information can be obtained from the HTML documentation and tutorials. More than 60 illustrative example programs serve as starting points for using SHARK. 2. Basic Tools—Rng, Array, and LinAlg The library provides general auxiliary functions and data structures for the development of machine learning algorithms. The Rng module generates reproducible and platform independent sequences of pseudo random numbers, which can be drawn from 14 predefined discrete and continuous parametric distributions. The Array class provides dynamic array templates of arbitrary type and dimension as well as basic operations acting on these templates. LinAlg implements linear algebra algorithms such as matrix inversion and singular value decomposition. 3. ReClaM—Regression and Classification Methods The goal of the ReClaM module is to provide machine learning algorithms for supervised classification and regression in a unified, modular framework. It is built like a construction kit, where the main building blocks are adaptive data processing models, error functions, and optimization algorithms. © 2008 Christian Igel, Verena Heidrich-Meisner and Tobias Glasmachers. [Figure 1, a schematic of the Model, ErrorFunction, and Optimizer classes with their init(...) and optimize(...) calls, is omitted here.]
Reference: text
sentIndex sentText sentNum sentScore
1 Institut für Neuroinformatik, Ruhr-Universität Bochum, 44780 Bochum, Germany. Editor: Soeren Sonnenburg. Abstract: SHARK is an object-oriented library for the design of adaptive systems. [sent-10, score-0.173]
2 It comprises methods for single- and multi-objective optimization (e. [sent-11, score-0.099]
3 , evolutionary and gradient-based algorithms) as well as kernel-based methods, neural networks, and other machine learning techniques. [sent-13, score-0.242]
4 Keywords: machine learning software, neural networks, kernel-methods, evolutionary algorithms, optimization, multi-objective-optimization 1. [sent-14, score-0.242]
5 Overview SHARK is a modular C++ library for the design and optimization of adaptive systems. [sent-15, score-0.291]
6 It serves as a toolbox for real world applications and basic research in computational intelligence and machine learning. [sent-16, score-0.042]
7 The library provides methods for single- and multi-objective optimization, in particular evolutionary and gradient-based algorithms, kernel-based learning methods, neural networks, and many other machine learning techniques. [sent-17, score-0.358]
8 Here we restrict the description of SHARK to its core components, although the library contains plenty of additional functionality. [sent-19, score-0.145]
9 Further information can be obtained from the HTML documentation and tutorials. [sent-20, score-0.025]
10 Basic Tools—Rng, Array, and LinAlg The library provides general auxiliary functions and data structures for the development of machine learning algorithms. [sent-23, score-0.116]
11 The Rng module generates reproducible and platform independent sequences of pseudo random numbers, which can be drawn from 14 predefined discrete and continuous parametric distributions. [sent-24, score-0.181]
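To make the reproducibility point concrete, here is a minimal sketch in standard C++; it uses std::mt19937 and the <random> distributions rather than SHARK's Rng module, whose interface is not reproduced here.

    #include <iostream>
    #include <random>

    int main() {
        // A fixed seed makes the sequence reproducible across runs of the program;
        // SHARK's Rng module additionally guarantees platform independence.
        std::mt19937 rng(42);
        std::normal_distribution<double> gauss(0.0, 1.0);   // a continuous distribution
        std::uniform_int_distribution<int> die(1, 6);       // a discrete distribution
        for (int i = 0; i < 5; ++i)
            std::cout << gauss(rng) << ' ' << die(rng) << '\n';
        return 0;
    }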
12 The Array class provides dynamic array templates of arbitrary type and dimension as well as basic operations acting on these templates. [sent-25, score-0.122]
13 LinAlg implements linear algebra algorithms such as matrix inversion and singular value decomposition. [sent-26, score-0.027]
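As a self-contained illustration of the kind of routine meant here, the following sketch inverts a small dense matrix by Gauss-Jordan elimination with partial pivoting; it is a generic textbook implementation (the invert function is ours), not the LinAlg interface itself.

    #include <array>
    #include <cmath>
    #include <iostream>
    #include <stdexcept>
    #include <utility>

    // Invert an N x N matrix via Gauss-Jordan elimination with partial pivoting.
    template <std::size_t N>
    std::array<std::array<double, N>, N>
    invert(std::array<std::array<double, N>, N> a) {
        std::array<std::array<double, N>, N> inv{};            // will hold the inverse
        for (std::size_t i = 0; i < N; ++i) inv[i][i] = 1.0;   // start from the identity

        for (std::size_t col = 0; col < N; ++col) {
            // Partial pivoting: bring the largest remaining entry into the pivot position.
            std::size_t pivot = col;
            for (std::size_t r = col + 1; r < N; ++r)
                if (std::fabs(a[r][col]) > std::fabs(a[pivot][col])) pivot = r;
            if (std::fabs(a[pivot][col]) < 1e-12)
                throw std::runtime_error("matrix is numerically singular");
            std::swap(a[col], a[pivot]);
            std::swap(inv[col], inv[pivot]);

            // Normalize the pivot row, then eliminate the column from all other rows.
            const double scale = a[col][col];
            for (std::size_t c = 0; c < N; ++c) { a[col][c] /= scale; inv[col][c] /= scale; }
            for (std::size_t r = 0; r < N; ++r) {
                if (r == col) continue;
                const double factor = a[r][col];
                for (std::size_t c = 0; c < N; ++c) {
                    a[r][c]   -= factor * a[col][c];
                    inv[r][c] -= factor * inv[col][c];
                }
            }
        }
        return inv;
    }

    int main() {
        std::array<std::array<double, 2>, 2> m{{{4.0, 7.0}, {2.0, 6.0}}};
        const auto mi = invert(m);   // expected: [[0.6, -0.7], [-0.2, 0.4]]
        std::cout << mi[0][0] << ' ' << mi[0][1] << '\n'
                  << mi[1][0] << ' ' << mi[1][1] << '\n';
    }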
14 ReClaM—Regression and Classification Methods The goal of the ReClaM module is to provide machine learning algorithms for supervised classification and regression in a unified, modular framework. [sent-28, score-0.149]
15 It is built like a construction kit, where the main building blocks are adaptive data processing models, error functions, and optimization algorithms. [sent-29, score-0.108]
16 Figure 1: Almost all ReClaM objects inherit from one of the three base classes Model, ErrorFunction, and Optimizer. [sent-58, score-0.027]
17 The optimizer has access to the parameter vector $w$ of the model $f : \mathbb{R}^n \times \mathbb{R}^p \to \mathbb{R}^m$, $(x, w) \mapsto f_w(x)$, to minimize a scalar error function $E$. [sent-59, score-0.053]
18 The superclasses representing these components communicate through fixed interfaces. [sent-63, score-0.029]
19 A problem is defined by a model, which specifies a parametric family of candidate hypotheses, and a possibly regularized error function to minimize (and, of course, sample data). [sent-65, score-0.055]
20 It is usually solved with an (iterative) optimization algorithm, which adapts the model parameters in order to minimize the error function evaluated on the given data set. [sent-66, score-0.103]
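The construction-kit structure described in the preceding sentences can be sketched as follows. This is only a conceptual rendering of the Model / ErrorFunction / Optimizer split of Figure 1, not SHARK's actual class declarations: a one-dimensional affine model, a mean-squared error with a numerical gradient, and a gradient-descent optimizer (our own stand-in) that iteratively adapts the parameters.

    #include <iostream>
    #include <vector>

    using Vec = std::vector<double>;

    // Adaptive data-processing model f_w(x); here simply w0 + w1 * x.
    struct Model {
        Vec w{0.0, 0.0};
        double operator()(double x) const { return w[0] + w[1] * x; }
    };

    // Scalar error function E(w) (mean squared error) with a numerical gradient.
    struct ErrorFunction {
        const Vec& x;
        const Vec& y;
        double error(const Model& m) const {
            double e = 0.0;
            for (std::size_t i = 0; i < x.size(); ++i) {
                const double d = m(x[i]) - y[i];
                e += d * d;
            }
            return e / x.size();
        }
        Vec gradient(Model m) const {                        // central differences
            Vec g(m.w.size());
            const double h = 1e-6;
            for (std::size_t j = 0; j < m.w.size(); ++j) {
                const double orig = m.w[j];
                m.w[j] = orig + h; const double ep = error(m);
                m.w[j] = orig - h; const double em = error(m);
                m.w[j] = orig;
                g[j] = (ep - em) / (2.0 * h);
            }
            return g;
        }
    };

    // Optimizer: adapts the model parameters so that the error decreases.
    struct GradientDescent {
        double rate = 0.05;
        void step(Model& m, const ErrorFunction& E) const {
            const Vec g = E.gradient(m);
            for (std::size_t j = 0; j < m.w.size(); ++j) m.w[j] -= rate * g[j];
        }
    };

    int main() {
        const Vec x{0.0, 1.0, 2.0, 3.0, 4.0}, y{1.0, 3.0, 5.0, 7.0, 9.0};  // y = 1 + 2x
        Model model;
        ErrorFunction err{x, y};
        GradientDescent opt;
        for (int it = 0; it < 2000; ++it) opt.step(model, err);
        std::cout << "w0 = " << model.w[0] << ", w1 = " << model.w[1]
                  << ", E = " << err.error(model) << '\n';   // close to 1, 2 and 0
    }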
21 It offers a variety of predefined network models including feed-forward and recurrent multi-layer perceptron networks, radial basis function networks, and CMACs. [sent-70, score-0.073]
22 Several gradient-based optimization algorithms are available for network training and general-purpose optimization, including the conjugate gradient method, the BFGS algorithm, and the improved Rprop algorithm (Igel and Hüsken, 2003). [sent-71, score-0.152]
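To make the last of these rules concrete, here is a sketch of a single iRprop- parameter update (the variant without weight backtracking described in the cited reference); it restates the published update rule generically and is not SHARK's optimizer class.

    #include <algorithm>
    #include <iostream>
    #include <vector>

    // iRprop-: sign-based step-size adaptation. Each parameter keeps its own
    // step size, which grows while the gradient sign is stable and shrinks
    // after a sign change; right after a sign change the update is suppressed once.
    struct IRpropMinus {
        double etaPlus = 1.2, etaMinus = 0.5;
        double deltaMin = 1e-6, deltaMax = 50.0;
        std::vector<double> delta;        // per-parameter step sizes
        std::vector<double> prevGrad;     // gradient from the previous iteration

        explicit IRpropMinus(std::size_t n) : delta(n, 0.01), prevGrad(n, 0.0) {}

        void step(std::vector<double>& w, std::vector<double> grad) {
            for (std::size_t i = 0; i < w.size(); ++i) {
                const double change = grad[i] * prevGrad[i];
                if (change > 0.0) {
                    delta[i] = std::min(delta[i] * etaPlus, deltaMax);
                } else if (change < 0.0) {
                    delta[i] = std::max(delta[i] * etaMinus, deltaMin);
                    grad[i] = 0.0;        // suppress the update after a sign change
                }
                if (grad[i] > 0.0)      w[i] -= delta[i];
                else if (grad[i] < 0.0) w[i] += delta[i];
                prevGrad[i] = grad[i];
            }
        }
    };

    int main() {
        // Minimize f(w) = (w0 - 3)^2 + (w1 + 1)^2 starting from the origin.
        std::vector<double> w{0.0, 0.0};
        IRpropMinus opt(w.size());
        for (int it = 0; it < 100; ++it) {
            std::vector<double> g{2.0 * (w[0] - 3.0), 2.0 * (w[1] + 1.0)};
            opt.step(w, g);
        }
        std::cout << w[0] << ' ' << w[1] << '\n';   // approaches 3 and -1
    }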
23 The library offers kernelized versions of several learning machines from nearest neighbor classifiers and simple Gaussian processes to different flavors of support vector machines. [sent-73, score-0.189]
24 These algorithms operate on general kernel objects and users can supply new kernel functions easily. [sent-74, score-0.093]
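The notion of general kernel objects with user-supplied kernel functions can be illustrated by a minimal interface sketch (conceptual only, not SHARK's actual kernel class hierarchy): any object that evaluates a pair of inputs to a real number can be plugged in, and a new kernel only has to implement that one evaluation.

    #include <cmath>
    #include <iostream>
    #include <vector>

    // Abstract kernel: maps a pair of inputs to a real number.
    struct Kernel {
        virtual double operator()(const std::vector<double>& x,
                                  const std::vector<double>& y) const = 0;
        virtual ~Kernel() = default;
    };

    // A user-supplied kernel only needs to implement operator().
    struct RbfKernel : Kernel {
        double gamma;
        explicit RbfKernel(double g) : gamma(g) {}
        double operator()(const std::vector<double>& x,
                          const std::vector<double>& y) const override {
            double d2 = 0.0;
            for (std::size_t i = 0; i < x.size(); ++i) {
                const double d = x[i] - y[i];
                d2 += d * d;
            }
            return std::exp(-gamma * d2);   // k(x, y) = exp(-gamma * ||x - y||^2)
        }
    };

    int main() {
        const RbfKernel k(0.5);
        std::cout << k({1.0, 0.0}, {0.0, 1.0}) << '\n';   // exp(-0.5 * 2) = exp(-1)
    }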
25 The SVM training automatically switches between the most efficient SMO-like algorithms available depending on the current problem size (Fan et al.). [sent-76, score-0.027]
26 On top of these models, ReClaM defines meta-models for model selection of kernel and regularization parameters. [sent-78, score-0.035]
27 It offers more objective functions and optimization methods for model selection than any other library. [sent-79, score-0.124]
28 For optimization, nested grid-search and evolutionary kernel learning are supported, and efficient gradient-based optimization is available whenever possible. [sent-83, score-0.353]
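A nested (iteratively refined) grid search of the kind mentioned here might be sketched as follows; cvError is a hypothetical stand-in for a cross-validation error estimate under hyperparameters (log2 C, log2 gamma) and is not a SHARK function.

    #include <iostream>

    // Hypothetical stand-in for a cross-validation error estimate of an SVM
    // with regularization parameter 2^log2C and kernel parameter 2^log2Gamma.
    // In practice this would train and evaluate models; here it is a smooth
    // surrogate with its minimum at (1, -3) so that the sketch is runnable.
    double cvError(double log2C, double log2Gamma) {
        return (log2C - 1.0) * (log2C - 1.0) + (log2Gamma + 3.0) * (log2Gamma + 3.0);
    }

    int main() {
        double bestC = 0.0, bestG = 0.0, bestErr = cvError(bestC, bestG);
        double range = 8.0;                            // half-width of the grid (log2 units)

        // Nested grid search: each pass re-centers a finer grid on the best point so far.
        for (int level = 0; level < 4; ++level) {
            const double step = range / 4.0;
            const double c0 = bestC, g0 = bestG;
            for (double c = c0 - range; c <= c0 + range + 1e-9; c += step)
                for (double g = g0 - range; g <= g0 + range + 1e-9; g += step) {
                    const double e = cvError(c, g);
                    if (e < bestErr) { bestErr = e; bestC = c; bestG = g; }
                }
            range /= 4.0;
        }
        std::cout << "log2 C = " << bestC << ", log2 gamma = " << bestG
                  << ", error = " << bestErr << '\n';
    }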
29 For both model training and model selection, we make use of ReClaM’s superclass architecture to describe and solve the optimization problems. [sent-84, score-0.076]
30 For example, a gradient-based optimization algorithm may decrease a radius-margin quotient in order to adapt the hyperparameters of an SVM, where in each iteration an SVM model is trained by a special quadratic program optimizer to determine the margin. [sent-85, score-0.182]
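In its commonly used form (an assumption about the exact objective; this follows the standard radius-margin bound rather than a SHARK-specific definition), the quotient referred to here is $T(\theta) = R^2(\theta)\,\lVert w(\theta)\rVert^2$, where $\theta$ collects the kernel and regularization hyperparameters, $1/\lVert w(\theta)\rVert$ is the margin of the SVM trained with $\theta$, and $R(\theta)$ is the radius of the smallest sphere enclosing the training data in the induced feature space. Both $w(\theta)$ and $R(\theta)$ are obtained by solving quadratic programs, which is why each gradient step on $T$ requires retraining the SVM, as described above.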
31 EALib and MOO-EALib—Evolutionary Single- and Multi-objective Optimization The evolutionary algorithms module (EALib) implements classes for stochastic direct optimization using evolutionary computing, in particular genetic algorithms and evolution strategies (ESs). [sent-89, score-0.73]
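A minimal example of the kind of algorithm such a module covers is the (1+1) evolution strategy with an online variant of the 1/5 success rule; the sketch below is generic and does not use EALib classes.

    #include <cmath>
    #include <iostream>
    #include <random>
    #include <vector>

    // Toy objective to be minimized.
    double sphere(const std::vector<double>& x) {
        double s = 0.0;
        for (double v : x) s += v * v;
        return s;
    }

    int main() {
        std::mt19937 rng(1);
        std::normal_distribution<double> gauss(0.0, 1.0);

        std::vector<double> parent(10, 5.0);          // start far from the optimum
        double fParent = sphere(parent);
        double sigma = 1.0;                           // global mutation step size

        for (int gen = 0; gen < 2000; ++gen) {
            // Mutation: offspring = parent + sigma * N(0, I).
            std::vector<double> child(parent);
            for (double& v : child) v += sigma * gauss(rng);
            const double fChild = sphere(child);

            // (1+1) selection plus step-size control: the factors are chosen so
            // that sigma is stationary at a success rate of 1/5 (the 1/5 rule).
            if (fChild <= fParent) {
                parent = child; fParent = fChild;
                sigma *= std::exp(0.8);               // success: enlarge the step size
            } else {
                sigma *= std::exp(-0.2);              // failure: shrink the step size
            }
        }
        std::cout << "best objective value: " << fParent << '\n';
    }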
32 Mutation and recombination operators for different types of chromosomes, for example real-valued or binary vectors, are available. [sent-96, score-0.025]
33 The MOO-EALib extends the EALib to evolutionary multi-objective optimization (EMO). [sent-98, score-0.242]
34 To our knowledge, the MOO-EALib module makes SHARK one of the most comprehensive libraries for EMO. [sent-102, score-0.165]
35 The efficient implementation of measures for quantifying the quality of sets of candidate solutions is a strong argument for the MOO-EALib. [sent-103, score-0.054]
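One widely used measure of this kind is the dominated hypervolume of a set of mutually non-dominated solutions. For two objectives that are both minimized it reduces to a simple sweep, as in the generic sketch below (independent of the MOO-EALib implementation).

    #include <algorithm>
    #include <iostream>
    #include <utility>
    #include <vector>

    // Area dominated by a set of mutually non-dominated points (two objectives,
    // both minimized), measured against a reference point that is worse than
    // every point in both objectives.
    double hypervolume2d(std::vector<std::pair<double, double>> front,
                         std::pair<double, double> ref) {
        // Sort by the first objective; for a non-dominated set the second
        // objective is then decreasing, so the area is a sum of rectangles.
        std::sort(front.begin(), front.end());
        double volume = 0.0;
        double prevF2 = ref.second;
        for (const auto& p : front) {
            volume += (ref.first - p.first) * (prevF2 - p.second);
            prevF2 = p.second;
        }
        return volume;
    }

    int main() {
        const std::vector<std::pair<double, double>> front{{1.0, 4.0}, {2.0, 2.0}, {3.0, 1.0}};
        std::cout << hypervolume2d(front, {5.0, 5.0}) << '\n';   // 4 + 6 + 2 = 12
    }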
36 In SHARK we put an emphasis on variable-metric ESs for real-valued optimization. [sent-104, score-0.023]
37 Thus, the most recent implementation of the covariance matrix adaptation ES (CMA-ES; Hansen et al.) is available in SHARK. [sent-105, score-0.079]
38 We do not know of any C++ toolbox for EAs that comes close to the EALib in terms of flexibility and quality of algorithms for continuous optimization. [sent-108, score-0.042]
39 No third-party libraries are required, except Qt and Qwt for graphical examples. [sent-113, score-0.058]
40 Acknowledgments The authors of this paper comprise the team responsible for a major revision and the maintenance of the SHARK library at the time of writing the article. [sent-114, score-0.202]
41 Kreutz wrote the basic components such as LinAlg, Array, and Rng, as well as the EALib. [sent-116, score-0.058]
42 Afterwards, many people contributed to the package, in particular (in alphabetical order) R. [sent-120, score-0.025]
43 The SHARK project is supported by the Honda Research Institute Europe. [sent-133, score-0.032]
44 Reducing the time complexity of the derandomized evolution strategy with covariance matrix adaptation (CMA-ES). [sent-165, score-0.147]
45 Gradient-based optimization of kernel-target alignment for sequence kernels applied to bacterial gene start detection. [sent-178, score-0.12]
46 Efficient face detection by a cascaded support vector machine expansion. [sent-191, score-0.032]
wordName wordTfidf (topN-words)
[('igel', 0.422), ('reclam', 0.383), ('shark', 0.383), ('glasmachers', 0.306), ('evolutionary', 0.242), ('ealib', 0.192), ('neuroinformatik', 0.153), ('library', 0.116), ('emo', 0.115), ('linalg', 0.115), ('rng', 0.115), ('rub', 0.115), ('suttorp', 0.115), ('tobias', 0.115), ('verena', 0.115), ('module', 0.107), ('array', 0.095), ('christian', 0.087), ('bochum', 0.077), ('eas', 0.077), ('eidrich', 0.077), ('eisner', 0.077), ('ess', 0.077), ('gel', 0.077), ('lasmachers', 0.077), ('romdhani', 0.077), ('rprop', 0.077), ('optimization', 0.076), ('hansen', 0.074), ('chromosomes', 0.065), ('libraries', 0.058), ('wrote', 0.058), ('adaptation', 0.054), ('optimizer', 0.053), ('quotient', 0.053), ('offers', 0.048), ('prede', 0.044), ('alignment', 0.044), ('fan', 0.042), ('toolbox', 0.042), ('modular', 0.042), ('chapelle', 0.038), ('evolution', 0.036), ('kernel', 0.035), ('writing', 0.034), ('exibility', 0.034), ('svm', 0.033), ('kit', 0.032), ('derandomized', 0.032), ('recombination', 0.032), ('vo', 0.032), ('invented', 0.032), ('toussaint', 0.032), ('cascaded', 0.032), ('gnu', 0.032), ('alberts', 0.032), ('torr', 0.032), ('project', 0.032), ('adaptive', 0.032), ('candidate', 0.031), ('soeren', 0.029), ('html', 0.029), ('resilient', 0.029), ('plenty', 0.029), ('linux', 0.029), ('revision', 0.029), ('communicate', 0.029), ('implements', 0.027), ('institut', 0.027), ('adapts', 0.027), ('inherited', 0.027), ('afterwards', 0.027), ('icann', 0.027), ('switches', 0.027), ('reproducible', 0.027), ('templates', 0.027), ('init', 0.027), ('license', 0.027), ('roth', 0.027), ('networks', 0.026), ('covariance', 0.025), ('documentation', 0.025), ('mutation', 0.025), ('populations', 0.025), ('recurrent', 0.025), ('bfgs', 0.025), ('kernelized', 0.025), ('qt', 0.025), ('contributed', 0.025), ('design', 0.025), ('parametric', 0.024), ('ms', 0.023), ('public', 0.023), ('emphasis', 0.023), ('quantifying', 0.023), ('comprises', 0.023), ('supply', 0.023), ('ller', 0.023), ('maintenance', 0.023), ('pseudo', 0.023)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999994 85 jmlr-2008-Shark (Machine Learning Open Source Software Paper)
Author: Christian Igel, Verena Heidrich-Meisner, Tobias Glasmachers
Abstract: SHARK is an object-oriented library for the design of adaptive systems. It comprises methods for single- and multi-objective optimization (e.g., evolutionary and gradient-based algorithms) as well as kernel-based methods, neural networks, and other machine learning techniques. Keywords: machine learning software, neural networks, kernel-methods, evolutionary algorithms, optimization, multi-objective-optimization 1. Overview SHARK is a modular C++ library for the design and optimization of adaptive systems. It serves as a toolbox for real world applications and basic research in computational intelligence and machine learning. The library provides methods for single- and multi-objective optimization, in particular evolutionary and gradient-based algorithms, kernel-based learning methods, neural networks, and many other machine learning techniques. Its main design criteria are flexibility and speed. Here we restrict the description of SHARK to its core components, although the library contains plenty of additional functionality. Further information can be obtained from the HTML documentation and tutorials. More than 60 illustrative example programs serve as starting points for using SHARK. 2. Basic Tools—Rng, Array, and LinAlg The library provides general auxiliary functions and data structures for the development of machine learning algorithms. The Rng module generates reproducible and platform independent sequences of pseudo random numbers, which can be drawn from 14 predefined discrete and continuous parametric distributions. The Array class provides dynamic array templates of arbitrary type and dimension as well as basic operations acting on these templates. LinAlg implements linear algebra algorithms such as matrix inversion and singular value decomposition. 3. ReClaM—Regression and Classification Methods The goal of the ReClaM module is to provide machine learning algorithms for supervised classification and regression in a unified, modular framework. It is built like a construction kit, where the main building blocks are adaptive data processing models, error functions, and optimization algorithms. © 2008 Christian Igel, Verena Heidrich-Meisner and Tobias Glasmachers. [Figure 1 omitted.]
2 0.067398995 8 jmlr-2008-Accelerated Neural Evolution through Cooperatively Coevolved Synapses
Author: Faustino Gomez, Jürgen Schmidhuber, Risto Miikkulainen
Abstract: Many complex control problems require sophisticated solutions that are not amenable to traditional controller design. Not only is it difficult to model real world systems, but often it is unclear what kind of behavior is required to solve the task. Reinforcement learning (RL) approaches have made progress by using direct interaction with the task environment, but have so far not scaled well to large state spaces and environments that are not fully observable. In recent years, neuroevolution, the artificial evolution of neural networks, has had remarkable success in tasks that exhibit these two properties. In this paper, we compare a neuroevolution method called Cooperative Synapse Neuroevolution (CoSyNE), that uses cooperative coevolution at the level of individual synaptic weights, to a broad range of reinforcement learning algorithms on very difficult versions of the pole balancing problem that involve large (continuous) state spaces and hidden state. CoSyNE is shown to be significantly more efficient and powerful than the other methods on these tasks. Keywords: coevolution, recurrent neural networks, non-linear control, genetic algorithms, experimental comparison
3 0.038905635 55 jmlr-2008-Linear-Time Computation of Similarity Measures for Sequential Data
Author: Konrad Rieck, Pavel Laskov
Abstract: Efficient and expressive comparison of sequences is an essential procedure for learning with sequential data. In this article we propose a generic framework for computation of similarity measures for sequences, covering various kernel, distance and non-metric similarity functions. The basis for comparison is embedding of sequences using a formal language, such as a set of natural words, k-grams or all contiguous subsequences. As realizations of the framework we provide linear-time algorithms of different complexity and capabilities using sorted arrays, tries and suffix trees as underlying data structures. Experiments on data sets from bioinformatics, text processing and computer security illustrate the efficiency of the proposed algorithms—enabling peak performances of up to 106 pairwise comparisons per second. The utility of distances and non-metric similarity measures for sequences as alternatives to string kernels is demonstrated in applications of text categorization, network intrusion detection and transcription site recognition in DNA. Keywords: string kernels, string distances, learning with sequential data
4 0.03566891 53 jmlr-2008-Learning to Combine Motor Primitives Via Greedy Additive Regression
Author: Manu Chhabra, Robert A. Jacobs
Abstract: The computational complexities arising in motor control can be ameliorated through the use of a library of motor synergies. We present a new model, referred to as the Greedy Additive Regression (GAR) model, for learning a library of torque sequences, and for learning the coefficients of a linear combination of sequences minimizing a cost function. From the perspective of numerical optimization, the GAR model is interesting because it creates a library of “local features”—each sequence in the library is a solution to a single training task—and learns to combine these sequences using a local optimization procedure, namely, additive regression. We speculate that learners with local representational primitives and local optimization procedures will show good performance on nonlinear tasks. The GAR model is also interesting from the perspective of motor control because it outperforms several competing models. Results using a simulated two-joint arm suggest that the GAR model consistently shows excellent performance in the sense that it rapidly learns to perform novel, complex motor tasks. Moreover, its library is overcomplete and sparse, meaning that only a small fraction of the stored torque sequences are used when learning a new movement. The library is also robust in the sense that, after an initial training period, nearly all novel movements can be learned as additive combinations of sequences in the library, and in the sense that it shows good generalization when an arm’s dynamics are altered between training and test conditions, such as when a payload is added to the arm. Lastly, the GAR model works well regardless of whether motor tasks are specified in joint space or Cartesian space. We conclude that learning techniques using local primitives and optimization procedures are viable and potentially important methods for motor control and possibly other domains, and that these techniques deserve further examination by the artificial intelligence and cognitive science
5 0.033861306 29 jmlr-2008-Cross-Validation Optimization for Large Scale Structured Classification Kernel Methods
Author: Matthias W. Seeger
Abstract: We propose a highly efficient framework for penalized likelihood kernel methods applied to multiclass models with a large, structured set of classes. As opposed to many previous approaches which try to decompose the fitting problem into many smaller ones, we focus on a Newton optimization of the complete model, making use of model structure and linear conjugate gradients in order to approximate Newton search directions. Crucially, our learning method is based entirely on matrix-vector multiplication primitives with the kernel matrices and their derivatives, allowing straightforward specialization to new kernels, and focusing code optimization efforts to these primitives only. Kernel parameters are learned automatically, by maximizing the cross-validation log likelihood in a gradient-based way, and predictive probabilities are estimated. We demonstrate our approach on large scale text classification tasks with hierarchical structure on thousands of classes, achieving state-of-the-art results in an order of magnitude less time than previous work. Parts of this work appeared in the conference paper Seeger (2007). Keywords: multi-way classification, kernel logistic regression, hierarchical classification, cross validation optimization, Newton-Raphson optimization
6 0.030618008 90 jmlr-2008-Theoretical Advantages of Lenient Learners: An Evolutionary Game Theoretic Perspective
7 0.030092265 70 jmlr-2008-On Relevant Dimensions in Kernel Feature Spaces
8 0.029802375 46 jmlr-2008-LIBLINEAR: A Library for Large Linear Classification (Machine Learning Open Source Software Paper)
9 0.027116088 89 jmlr-2008-Support Vector Machinery for Infinite Ensemble Learning
10 0.025952281 76 jmlr-2008-Optimization Techniques for Semi-Supervised Support Vector Machines
11 0.024637487 86 jmlr-2008-SimpleMKL
12 0.023150437 92 jmlr-2008-Universal Multi-Task Kernels
13 0.022731639 11 jmlr-2008-Aggregation of SVM Classifiers Using Sobolev Spaces
14 0.021858031 66 jmlr-2008-Multi-class Discriminant Kernel Learning via Convex Programming (Special Topic on Model Selection)
15 0.021749644 2 jmlr-2008-A Library for Locally Weighted Projection Regression (Machine Learning Open Source Software Paper)
16 0.021733437 32 jmlr-2008-Estimating the Confidence Interval for Prediction Errors of Support Vector Machine Classifiers
17 0.021011503 56 jmlr-2008-Magic Moments for Structured Output Prediction
18 0.020668909 58 jmlr-2008-Max-margin Classification of Data with Absent Features
19 0.020108284 51 jmlr-2008-Learning Similarity with Operator-valued Large-margin Classifiers
20 0.018617934 54 jmlr-2008-Learning to Select Features using their Properties
topicId topicWeight
[(0, 0.094), (1, -0.048), (2, 0.027), (3, 0.019), (4, -0.043), (5, -0.05), (6, -0.003), (7, 0.075), (8, 0.031), (9, 0.043), (10, -0.045), (11, 0.059), (12, 0.044), (13, 0.02), (14, 0.039), (15, 0.004), (16, 0.112), (17, 0.137), (18, 0.018), (19, -0.035), (20, -0.056), (21, 0.227), (22, -0.178), (23, 0.021), (24, 0.19), (25, 0.081), (26, -0.185), (27, 0.155), (28, -0.196), (29, -0.048), (30, 0.315), (31, 0.056), (32, -0.272), (33, 0.036), (34, -0.055), (35, 0.121), (36, -0.159), (37, -0.044), (38, 0.16), (39, 0.199), (40, 0.207), (41, 0.156), (42, 0.1), (43, 0.065), (44, 0.031), (45, 0.063), (46, -0.011), (47, -0.178), (48, 0.008), (49, -0.076)]
simIndex simValue paperId paperTitle
same-paper 1 0.95412701 85 jmlr-2008-Shark (Machine Learning Open Source Software Paper)
Author: Christian Igel, Verena Heidrich-Meisner, Tobias Glasmachers
Abstract: SHARK is an object-oriented library for the design of adaptive systems. It comprises methods for single- and multi-objective optimization (e.g., evolutionary and gradient-based algorithms) as well as kernel-based methods, neural networks, and other machine learning techniques. Keywords: machine learning software, neural networks, kernel-methods, evolutionary algorithms, optimization, multi-objective-optimization 1. Overview SHARK is a modular C++ library for the design and optimization of adaptive systems. It serves as a toolbox for real world applications and basic research in computational intelligence and machine learning. The library provides methods for single- and multi-objective optimization, in particular evolutionary and gradient-based algorithms, kernel-based learning methods, neural networks, and many other machine learning techniques. Its main design criteria are flexibility and speed. Here we restrict the description of SHARK to its core components, although the library contains plenty of additional functionality. Further information can be obtained from the HTML documentation and tutorials. More than 60 illustrative example programs serve as starting points for using SHARK. 2. Basic Tools—Rng, Array, and LinAlg The library provides general auxiliary functions and data structures for the development of machine learning algorithms. The Rng module generates reproducible and platform independent sequences of pseudo random numbers, which can be drawn from 14 predefined discrete and continuous parametric distributions. The Array class provides dynamic array templates of arbitrary type and dimension as well as basic operations acting on these templates. LinAlg implements linear algebra algorithms such as matrix inversion and singular value decomposition. 3. ReClaM—Regression and Classification Methods The goal of the ReClaM module is to provide machine learning algorithms for supervised classification and regression in a unified, modular framework. It is built like a construction kit, where the main building blocks are adaptive data processing models, error functions, and optimization algorithms. © 2008 Christian Igel, Verena Heidrich-Meisner and Tobias Glasmachers. [Figure 1 omitted.]
2 0.56817716 8 jmlr-2008-Accelerated Neural Evolution through Cooperatively Coevolved Synapses
Author: Faustino Gomez, Jürgen Schmidhuber, Risto Miikkulainen
Abstract: Many complex control problems require sophisticated solutions that are not amenable to traditional controller design. Not only is it difficult to model real world systems, but often it is unclear what kind of behavior is required to solve the task. Reinforcement learning (RL) approaches have made progress by using direct interaction with the task environment, but have so far not scaled well to large state spaces and environments that are not fully observable. In recent years, neuroevolution, the artificial evolution of neural networks, has had remarkable success in tasks that exhibit these two properties. In this paper, we compare a neuroevolution method called Cooperative Synapse Neuroevolution (CoSyNE), that uses cooperative coevolution at the level of individual synaptic weights, to a broad range of reinforcement learning algorithms on very difficult versions of the pole balancing problem that involve large (continuous) state spaces and hidden state. CoSyNE is shown to be significantly more efficient and powerful than the other methods on these tasks. Keywords: coevolution, recurrent neural networks, non-linear control, genetic algorithms, experimental comparison
3 0.21589719 46 jmlr-2008-LIBLINEAR: A Library for Large Linear Classification (Machine Learning Open Source Software Paper)
Author: Rong-En Fan, Kai-Wei Chang, Cho-Jui Hsieh, Xiang-Rui Wang, Chih-Jen Lin
Abstract: LIBLINEAR is an open source library for large-scale linear classification. It supports logistic regression and linear support vector machines. We provide easy-to-use command-line tools and library calls for users and developers. Comprehensive documents are available for both beginners and advanced users. Experiments demonstrate that LIBLINEAR is very efficient on large sparse data sets. Keywords: large-scale linear classification, logistic regression, support vector machines, open source, machine learning
4 0.18553668 55 jmlr-2008-Linear-Time Computation of Similarity Measures for Sequential Data
Author: Konrad Rieck, Pavel Laskov
Abstract: Efficient and expressive comparison of sequences is an essential procedure for learning with sequential data. In this article we propose a generic framework for computation of similarity measures for sequences, covering various kernel, distance and non-metric similarity functions. The basis for comparison is embedding of sequences using a formal language, such as a set of natural words, k-grams or all contiguous subsequences. As realizations of the framework we provide linear-time algorithms of different complexity and capabilities using sorted arrays, tries and suffix trees as underlying data structures. Experiments on data sets from bioinformatics, text processing and computer security illustrate the efficiency of the proposed algorithms—enabling peak performances of up to 106 pairwise comparisons per second. The utility of distances and non-metric similarity measures for sequences as alternatives to string kernels is demonstrated in applications of text categorization, network intrusion detection and transcription site recognition in DNA. Keywords: string kernels, string distances, learning with sequential data
5 0.1429172 76 jmlr-2008-Optimization Techniques for Semi-Supervised Support Vector Machines
Author: Olivier Chapelle, Vikas Sindhwani, Sathiya S. Keerthi
Abstract: Due to its wide applicability, the problem of semi-supervised classification is attracting increasing attention in machine learning. Semi-Supervised Support Vector Machines (S 3 VMs) are based on applying the margin maximization principle to both labeled and unlabeled examples. Unlike SVMs, their formulation leads to a non-convex optimization problem. A suite of algorithms have recently been proposed for solving S3 VMs. This paper reviews key ideas in this literature. The performance and behavior of various S3 VM algorithms is studied together, under a common experimental setting. Keywords: semi-supervised learning, support vector machines, non-convex optimization, transductive learning
6 0.13115983 51 jmlr-2008-Learning Similarity with Operator-valued Large-margin Classifiers
7 0.12643182 29 jmlr-2008-Cross-Validation Optimization for Large Scale Structured Classification Kernel Methods
8 0.12165705 89 jmlr-2008-Support Vector Machinery for Infinite Ensemble Learning
10 0.12027946 53 jmlr-2008-Learning to Combine Motor Primitives Via Greedy Additive Regression
11 0.11105703 86 jmlr-2008-SimpleMKL
12 0.11060162 15 jmlr-2008-An Information Criterion for Variable Selection in Support Vector Machines (Special Topic on Model Selection)
13 0.1017389 62 jmlr-2008-Model Selection Through Sparse Maximum Likelihood Estimation for Multivariate Gaussian or Binary Data
14 0.095623977 13 jmlr-2008-An Error Bound Based on a Worst Likely Assignment
15 0.093986779 56 jmlr-2008-Magic Moments for Structured Output Prediction
16 0.085073411 67 jmlr-2008-Near-Optimal Sensor Placements in Gaussian Processes: Theory, Efficient Algorithms and Empirical Studies
17 0.083663359 66 jmlr-2008-Multi-class Discriminant Kernel Learning via Convex Programming (Special Topic on Model Selection)
18 0.083316818 80 jmlr-2008-Ranking Individuals by Group Comparisons
19 0.081058294 87 jmlr-2008-Stationary Features and Cat Detection
20 0.080012977 73 jmlr-2008-On the Suitable Domain for SVM Training in Image Coding
topicId topicWeight
[(0, 0.015), (5, 0.021), (31, 0.024), (40, 0.023), (54, 0.021), (58, 0.02), (66, 0.022), (76, 0.019), (78, 0.611), (88, 0.043), (92, 0.024), (94, 0.053), (99, 0.02)]
simIndex simValue paperId paperTitle
same-paper 1 0.86384881 85 jmlr-2008-Shark (Machine Learning Open Source Software Paper)
Author: Christian Igel, Verena Heidrich-Meisner, Tobias Glasmachers
Abstract: SHARK is an object-oriented library for the design of adaptive systems. It comprises methods for single- and multi-objective optimization (e.g., evolutionary and gradient-based algorithms) as well as kernel-based methods, neural networks, and other machine learning techniques. Keywords: machine learning software, neural networks, kernel-methods, evolutionary algorithms, optimization, multi-objective-optimization 1. Overview SHARK is a modular C++ library for the design and optimization of adaptive systems. It serves as a toolbox for real world applications and basic research in computational intelligence and machine learning. The library provides methods for single- and multi-objective optimization, in particular evolutionary and gradient-based algorithms, kernel-based learning methods, neural networks, and many other machine learning techniques. Its main design criteria are flexibility and speed. Here we restrict the description of SHARK to its core components, although the library contains plenty of additional functionality. Further information can be obtained from the HTML documentation and tutorials. More than 60 illustrative example programs serve as starting points for using SHARK. 2. Basic Tools—Rng, Array, and LinAlg The library provides general auxiliary functions and data structures for the development of machine learning algorithms. The Rng module generates reproducible and platform independent sequences of pseudo random numbers, which can be drawn from 14 predefined discrete and continuous parametric distributions. The Array class provides dynamic array templates of arbitrary type and dimension as well as basic operations acting on these templates. LinAlg implements linear algebra algorithms such as matrix inversion and singular value decomposition. 3. ReClaM—Regression and Classification Methods The goal of the ReClaM module is to provide machine learning algorithms for supervised classification and regression in a unified, modular framework. It is built like a construction kit, where the main building blocks are adaptive data processing models, error functions, and optimization algorithms. © 2008 Christian Igel, Verena Heidrich-Meisner and Tobias Glasmachers. [Figure 1 omitted.]
2 0.81592947 92 jmlr-2008-Universal Multi-Task Kernels
Author: Andrea Caponnetto, Charles A. Micchelli, Massimiliano Pontil, Yiming Ying
Abstract: In this paper we are concerned with reproducing kernel Hilbert spaces HK of functions from an input space into a Hilbert space Y , an environment appropriate for multi-task learning. The reproducing kernel K associated to HK has its values as operators on Y . Our primary goal here is to derive conditions which ensure that the kernel K is universal. This means that on every compact subset of the input space, every continuous function with values in Y can be uniformly approximated by sections of the kernel. We provide various characterizations of universal kernels and highlight them with several concrete examples of some practical importance. Our analysis uses basic principles of functional analysis and especially the useful notion of vector measures which we describe in sufficient detail to clarify our results. Keywords: multi-task learning, multi-task kernels, universal approximation, vector-valued reproducing kernel Hilbert spaces
Author: Jieping Ye, Shuiwang Ji, Jianhui Chen
Abstract: Regularized kernel discriminant analysis (RKDA) performs linear discriminant analysis in the feature space via the kernel trick. Its performance depends on the selection of kernels. In this paper, we consider the problem of multiple kernel learning (MKL) for RKDA, in which the optimal kernel matrix is obtained as a linear combination of pre-specified kernel matrices. We show that the kernel learning problem in RKDA can be formulated as convex programs. First, we show that this problem can be formulated as a semidefinite program (SDP). Based on the equivalence relationship between RKDA and least square problems in the binary-class case, we propose a convex quadratically constrained quadratic programming (QCQP) formulation for kernel learning in RKDA. A semi-infinite linear programming (SILP) formulation is derived to further improve the efficiency. We extend these formulations to the multi-class case based on a key result established in this paper. That is, the multi-class RKDA kernel learning problem can be decomposed into a set of binary-class kernel learning problems which are constrained to share a common kernel. Based on this decomposition property, SDP formulations are proposed for the multi-class case. Furthermore, it leads naturally to QCQP and SILP formulations. As the performance of RKDA depends on the regularization parameter, we show that this parameter can also be optimized in a joint framework with the kernel. Extensive experiments have been conducted and analyzed, and connections to other algorithms are discussed. Keywords: model selection, kernel discriminant analysis, semidefinite programming, quadratically constrained quadratic programming, semi-infinite linear programming
4 0.22844423 19 jmlr-2008-Bouligand Derivatives and Robustness of Support Vector Machines for Regression
Author: Andreas Christmann, Arnout Van Messem
Abstract: We investigate robustness properties for a broad class of support vector machines with non-smooth loss functions. These kernel methods are inspired by convex risk minimization in infinite dimensional Hilbert spaces. Leading examples are the support vector machine based on the ε-insensitive loss function, and kernel based quantile regression based on the pinball loss function. Firstly, we propose with the Bouligand influence function (BIF) a modification of F.R. Hampel’s influence function. The BIF has the advantage of being positive homogeneous which is in general not true for Hampel’s influence function. Secondly, we show that many support vector machines based on a Lipschitz continuous loss function and a bounded kernel have a bounded BIF and are thus robust in the sense of robust statistics based on influence functions. Keywords: Bouligand derivatives, empirical risk minimization, influence function, robustness, support vector machines
5 0.22227079 9 jmlr-2008-Active Learning by Spherical Subdivision
Author: Falk-Florian Henrich, Klaus Obermayer
Abstract: We introduce a computationally feasible, “constructive” active learning method for binary classification. The learning algorithm is initially formulated for separable classification problems, for a hyperspherical data space with constant data density, and for great spheres as classifiers. In order to reduce computational complexity the version space is restricted to spherical simplices and learning procedes by subdividing the edges of maximal length. We show that this procedure optimally reduces a tight upper bound on the generalization error. The method is then extended to other separable classification problems using products of spheres as data spaces and isometries induced by charts of the sphere. An upper bound is provided for the probability of disagreement between classifiers (hence the generalization error) for non-constant data densities on the sphere. The emphasis of this work lies on providing mathematically exact performance estimates for active learning strategies. Keywords: active learning, spherical subdivision, error bounds, simplex halving
6 0.2051477 86 jmlr-2008-SimpleMKL
7 0.20369545 51 jmlr-2008-Learning Similarity with Operator-valued Large-margin Classifiers
8 0.20304862 8 jmlr-2008-Accelerated Neural Evolution through Cooperatively Coevolved Synapses
9 0.19940434 27 jmlr-2008-Consistency of the Group Lasso and Multiple Kernel Learning
10 0.19063364 36 jmlr-2008-Finite-Time Bounds for Fitted Value Iteration
11 0.18569072 58 jmlr-2008-Max-margin Classification of Data with Absent Features
12 0.18557671 89 jmlr-2008-Support Vector Machinery for Infinite Ensemble Learning
13 0.17745216 11 jmlr-2008-Aggregation of SVM Classifiers Using Sobolev Spaces
14 0.17518955 55 jmlr-2008-Linear-Time Computation of Similarity Measures for Sequential Data
15 0.17421141 81 jmlr-2008-Regularization on Graphs with Function-adapted Diffusion Processes
17 0.16900179 31 jmlr-2008-Dynamic Hierarchical Markov Random Fields for Integrated Web Data Extraction
18 0.16753313 56 jmlr-2008-Magic Moments for Structured Output Prediction
19 0.16598049 74 jmlr-2008-Online Learning of Complex Prediction Problems Using Simultaneous Projections
20 0.16595683 94 jmlr-2008-Value Function Approximation using Multiple Aggregation for Multiattribute Resource Management