jmlr-2012-31 DEAP: Evolutionary Algorithms Made Easy
Source: pdf
Author: Félix-Antoine Fortin, François-Michel De Rainville, Marc-André Gardner, Marc Parizeau, Christian Gagné
Abstract: DEAP is a novel evolutionary computation framework for rapid prototyping and testing of ideas. Its design departs from most other existing frameworks in that it seeks to make algorithms explicit and data structures transparent, as opposed to the more common black-box frameworks. Freely available with extensive documentation at http://deap.gel.ulaval.ca, DEAP is an open source project under an LGPL license. Keywords: distributed evolutionary algorithms, software tools
Introduction

Evolutionary Computation (EC) is a sophisticated field with very diverse techniques and mechanisms, where even well-designed frameworks can become quite complicated under the hood. Frameworks thus strive to hide the implementation details as much as possible, by providing large libraries of high-level functionalities, often in many different flavours. The more elaborate these black boxes become, the more obscure they are, and the less likely the typical user is to ever take a peek under the hood to consider making changes. But using EC to solve real-world problems most often requires the customization of algorithms. The DEAP (Distributed Evolutionary Algorithms in Python) framework is built over the Python programming language, which provides the essential glue for assembling sophisticated EC systems. Its aim is to provide practical tools for the rapid prototyping of custom evolutionary algorithms, where every step of the process is as explicit (pseudocode-like) and easy to read and understand as possible.
Core Architecture

DEAP's core is composed of two simple structures: a creator and a toolbox. The creator module is a meta-factory that allows the run-time creation of classes via both inheritance and composition. Practically speaking, this allows the creation of genotypes and populations from any data structure, such as lists, sets, dictionaries, or trees. This creator concept is key to facilitating the implementation of any type of EA, including genetic algorithms (Mitchell, 1998), genetic programming (Banzhaf et al., 1998), evolution strategies (Beyer and Schwefel, 2002), the covariance matrix adaptation evolution strategy (Hansen and Ostermeier, 2001), particle swarm optimization (Kennedy and Eberhart, 2001), and many more.
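As a rough sketch of the creator in action (creator.create and base.Fitness are part of DEAP's documented API; the FitnessMax and Individual names are chosen here for illustration):

```python
from deap import base, creator

# Run-time class creation: inherit from base.Fitness (inheritance) and
# attach a "weights" class attribute (composition); a single positive
# weight means single-objective maximization.
creator.create("FitnessMax", base.Fitness, weights=(1.0,))

# Inherit from the built-in list and compose it with a fitness attribute;
# any container (set, dict, tree, ...) could serve as the base type.
creator.create("Individual", list, fitness=creator.FitnessMax)

ind = creator.Individual([1, 0, 1, 1])
print(isinstance(ind, list), ind.fitness.valid)  # True False
```

The created classes live in the creator module itself, so they can be referred to from anywhere in the program.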
The toolbox is a container for the tools (operators) that the user wants to use in his EA; it is manually populated by the user with selected tools. For instance, if the user needs a crossover in his algorithm but has access to several crossover types, he will choose the one best suited for his current problem, for example a uniform crossover “cxUniform”, and register it into the toolbox under a generic “mate” alias. If he later decides that some other crossover is better suited, his algorithm will remain unchanged; he will only need to update the corresponding alias in the toolbox, as sketched below.
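A minimal sketch of this aliasing mechanism (Toolbox, register, cxUniform, and cxTwoPoint are documented DEAP names; the toy individuals are illustrative):

```python
from deap import base, tools

toolbox = base.Toolbox()

# Register uniform crossover under the generic "mate" alias; the extra
# keyword indpb (per-attribute exchange probability) is frozen into it.
toolbox.register("mate", tools.cxUniform, indpb=0.5)

ind1, ind2 = [1, 1, 1, 1], [0, 0, 0, 0]
toolbox.mate(ind1, ind2)  # equivalent to tools.cxUniform(ind1, ind2, indpb=0.5)

# Swapping operators later leaves the calling algorithm untouched:
toolbox.register("mate", tools.cxTwoPoint)
```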
The core functionalities of DEAP are complemented by several peripheral modules. The algorithms module contains four classical EC algorithms: generational, (µ, λ), (µ + λ), and ask-and-tell (Collette et al., 2010). These serve as a starting point for users to build their own custom EAs meeting their specific needs. The tools module provides basic EC operators such as initializations, mutations, crossovers, and selections. These operators can be directly added to a toolbox in order to be used in algorithms. This module also contains a number of components that gather useful information regarding the evolution: fitness statistics, genealogy, a hall of fame for the best individuals encountered so far, and checkpointing mechanisms to allow evolution restarts.
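As a brief sketch of two of these components, assuming the tools API of current DEAP releases (tools.Statistics and tools.HallOfFame), which may differ in detail from the 2012 version:

```python
from deap import tools

# Keep the 5 best individuals encountered over the whole evolution.
hof = tools.HallOfFame(maxsize=5)

# Compile per-generation statistics over the individuals' fitness values.
stats = tools.Statistics(key=lambda ind: ind.fitness.values)
stats.register("avg", lambda values: sum(v[0] for v in values) / len(values))

# Inside an evolutionary loop, after evaluation, one would call:
#   hof.update(population)
#   record = stats.compile(population)  # e.g., {"avg": 42.0}
```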
The base module contains some data structures frequently used in EC but not implemented in standard Python. The last module, named dtm for Distributed Task Manager, offers distributed substitutes for common Python functions such as apply and map. It even balances the workload among workers to optimize the distribution efficiency.
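A sketch of the intended usage, assuming the dtm.map and dtm.start entry points that shipped with DEAP releases of that era (dtm was later replaced by other distribution backends such as SCOOP; the evaluate function is illustrative):

```python
from deap import dtm

def evaluate(individual):
    # Illustrative, embarrassingly parallel fitness computation.
    return sum(individual)

def main():
    population = [[0, 1] * 40 for _ in range(100)]
    # Drop-in substitute for the built-in map: the call is decomposed
    # into sub-tasks that are spread and load-balanced over the workers.
    fitnesses = dtm.map(evaluate, population)
    print(fitnesses[:3])

if __name__ == "__main__":
    # dtm programs are launched under MPI (e.g., mpirun -n 4 python script.py);
    # dtm.start bootstraps the task manager on every process.
    dtm.start(main)
```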
Consider as a complete example a feature selection problem, where the individual is represented as a bit-string in which each bit corresponds to a feature that can be selected or not. On line 2 of the example program, the relevant DEAP modules are first imported. Next, an Individual class is derived from the Python list and composed with our newly created FitnessMulti object. The toolbox register method accepts a variable number of arguments: the first is the alias name, the second is the function that we want to associate with this alias, and all other arguments are passed to this function when the alias is called. A bit alias, for instance, is registered to call random.randint with arguments 0 and 1, and thus generates a random bit.
For our feature selection problem, this bit function is simply called n = 80 times repeatedly using the tools.initRepeat method, which accepts three arguments: a container, a function, and the number of times to repeat the initialization. Similarly, a population alias is defined to allow population initialization, in this case with n = 100 individuals.
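Since the original program listing does not survive in this extract, the following is a hedged reconstruction of the setup just described; the FitnessMulti weights (minimize the first objective, maximize the second) are our assumption, while the bit, individual, and population aliases follow the prose:

```python
import random
from deap import base, creator, tools

creator.create("FitnessMulti", base.Fitness, weights=(-1.0, 1.0))
creator.create("Individual", list, fitness=creator.FitnessMulti)

toolbox = base.Toolbox()
# "bit" aliases random.randint with arguments 0 and 1.
toolbox.register("bit", random.randint, 0, 1)
# An individual repeats the bit generator n = 80 times into an Individual.
toolbox.register("individual", tools.initRepeat, creator.Individual,
                 toolbox.bit, n=80)
# A population repeats the individual initializer n = 100 times into a list.
toolbox.register("population", tools.initRepeat, list,
                 toolbox.individual, n=100)
```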
Line 18 then proceeds with the allocation of a population of individuals, whose fitnesses are evaluated on line 19 by mapping the evaluation function onto every element of the population container. Lines 20 and 21 replace the individuals' fitnesses with their newly computed values.
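Continuing the sketch above, the allocation and evaluation steps might read as follows (the two-objective evaluate function is a stand-in, since the paper's actual evaluation is not part of this extract):

```python
def evaluate(individual):
    # Stand-in bi-objective evaluation: e.g., number of selected features
    # (to minimize) and some quality score of the selection (to maximize).
    return float(sum(individual)), float(sum(individual[:40]))

toolbox.register("evaluate", evaluate)

pop = toolbox.population()                      # 100 individuals of 80 bits
fitnesses = toolbox.map(toolbox.evaluate, pop)  # sequential built-in map by default
for ind, fit in zip(pop, fitnesses):
    ind.fitness.values = fit
```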
Finally, lines 23 through 28 show the evolutionary loop, which uses a (µ + λ) strategy where µ = 100 parents (the current population) are mixed with λ = 100 offspring (lambda, line 24) for the selection process (NSGA-II) to produce the next generation of k = 100 parents (line 28). The varOr algorithm loops until it has generated λ offspring, using either crossover (mate alias) with probability cxpb, mutation (mutate alias) with probability mutpb, or reproduction.
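The loop itself might be reconstructed as below; varOr, cxUniform, mutFlipBit, and selNSGA2 are documented DEAP names, but the operator choices and the cxpb/mutpb values are assumptions:

```python
from deap import algorithms, tools

toolbox.register("mate", tools.cxUniform, indpb=0.5)
toolbox.register("mutate", tools.mutFlipBit, indpb=0.05)
toolbox.register("select", tools.selNSGA2)

MU, LAMBDA, NGEN = 100, 100, 50
for gen in range(NGEN):
    # varOr draws each offspring by crossover (prob. cxpb), mutation
    # (prob. mutpb), or reproduction, until LAMBDA offspring are produced.
    offspring = algorithms.varOr(pop, toolbox, LAMBDA, cxpb=0.7, mutpb=0.2)
    invalid = [ind for ind in offspring if not ind.fitness.valid]
    for ind, fit in zip(invalid, toolbox.map(toolbox.evaluate, invalid)):
        ind.fitness.values = fit
    # (mu + lambda): parents and offspring compete in NSGA-II selection.
    pop = toolbox.select(pop + offspring, k=MU)
```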
Efficient distribution of specific parts of user algorithms is made possible by the dtm module. The task manager decomposes distributed calls into sub-tasks, distributes them across the worker nodes, and automatically balances the workload evenly among them.
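Under the same dtm assumptions as above, distributing the computationally intensive part of such an algorithm can then amount to a single registration:

```python
from deap import dtm

# Replace the toolbox's default sequential map with dtm's distributed map;
# every toolbox.map call (such as the fitness evaluation above) is then
# spread across the workers with no other change to the algorithm.
toolbox.register("map", dtm.map)
```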
Table 1 presents a comparison of the number of lines required by some of the most popular frameworks to define the different components of a OneMax example (without distribution). Columns respectively indicate how many lines are required to define the bit-string type, to configure the operators, to implement the generational algorithm, and to execute the example. DEAP is the only framework that allows the complete definition of the EA in less than one hundred lines of code. Even though some frameworks may allow shorter expressions of standardized solutions, any needed customization of type or algorithm can force the user to dig deep into the framework and modify perhaps hundreds of lines. We therefore assert that DEAP is superior to previous frameworks for the rapid prototyping of new algorithms and the definition of custom types.
Conclusion

Current major EC frameworks generally do a good job of offering generic tools to solve hard problems using EAs. However, even experts can become overwhelmed when trying to implement specific features. The DEAP core is implemented in 6 program files (Python modules) with only 1653 lines of code. To these core lines, the dtm module adds another 2073 lines to enable the easy distribution of computationally intensive algorithm parts. The framework also includes more than 35 application examples that add another 2448 lines of code, leading to an average ratio of core lines to example lines of 1. In other words, DEAP implements the 15 examples of OpenBEAGLE plus 20 more with ten times fewer lines of code.
Acknowledgments

This work has been supported by the Fonds de recherche du Québec - Nature et technologies (FQRNT) and by the Natural Sciences and Engineering Research Council of Canada (NSERC).
References

Y. Collette, N. Hansen, G. Pujol, D. Salazar Aponte, and R. Le Riche. On object-oriented programming of optimizers – Examples in Scilab. In Multidisciplinary Design Optimization in Computational Mechanics, Wiley, 2010.

K. Deb, S. Agrawal, A. Pratap, and T. Meyarivan. A fast elitist non-dominated sorting genetic algorithm for multi-objective optimization: NSGA-II. In Proceedings of the Parallel Problem Solving from Nature Conference, pages 849–858, 2000.
wordName wordTfidf (topN-words)
[('deap', 0.532), ('tness', 0.212), ('alias', 0.2), ('dtm', 0.2), ('ec', 0.177), ('parizeau', 0.166), ('ulaval', 0.166), ('evolutionary', 0.166), ('frameworks', 0.158), ('module', 0.15), ('python', 0.146), ('crossover', 0.142), ('prototyping', 0.114), ('toolbox', 0.104), ('lines', 0.103), ('creator', 0.1), ('fortin', 0.1), ('rainville', 0.1), ('population', 0.096), ('evolution', 0.092), ('genetic', 0.09), ('individuals', 0.088), ('bec', 0.085), ('gardner', 0.085), ('mate', 0.085), ('marc', 0.075), ('custom', 0.071), ('qu', 0.071), ('operators', 0.07), ('ade', 0.067), ('agn', 0.067), ('ainville', 0.067), ('ardner', 0.067), ('banzhaf', 0.067), ('beyer', 0.067), ('collette', 0.067), ('customization', 0.067), ('deb', 0.067), ('ecj', 0.067), ('fitnessmulti', 0.067), ('functionalities', 0.067), ('gagn', 0.067), ('gel', 0.067), ('generational', 0.067), ('inspyred', 0.067), ('mutate', 0.067), ('nie', 0.067), ('offspring', 0.067), ('onemax', 0.067), ('ortin', 0.067), ('pyevolve', 0.067), ('swarm', 0.067), ('volutionary', 0.067), ('ea', 0.066), ('core', 0.062), ('hansen', 0.062), ('christian', 0.059), ('container', 0.057), ('asy', 0.057), ('kennedy', 0.057), ('manager', 0.057), ('register', 0.057), ('rapid', 0.053), ('eo', 0.051), ('populated', 0.051), ('workload', 0.051), ('tools', 0.049), ('roberts', 0.047), ('transparent', 0.047), ('balances', 0.044), ('accepts', 0.042), ('user', 0.04), ('lgorithms', 0.039), ('parents', 0.038), ('creation', 0.038), ('ca', 0.035), ('modules', 0.033), ('structures', 0.033), ('mechanisms', 0.031), ('newly', 0.029), ('peripheral', 0.028), ('strive', 0.028), ('mutations', 0.028), ('overwhelmed', 0.028), ('addison', 0.028), ('fonds', 0.028), ('hood', 0.028), ('inheritance', 0.028), ('keller', 0.028), ('lambda', 0.028), ('ostermeier', 0.028), ('syst', 0.028), ('wesley', 0.028), ('worker', 0.028), ('programming', 0.028), ('arguments', 0.028), ('generic', 0.027), ('kaufmann', 0.027), ('bit', 0.026), ('dictionaries', 0.026)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999964 31 jmlr-2012-DEAP: Evolutionary Algorithms Made Easy
Author: Félix-Antoine Fortin, François-Michel De Rainville, Marc-André Gardner, Marc Parizeau, Christian Gagné
Abstract: DEAP is a novel evolutionary computation framework for rapid prototyping and testing of ideas. Its design departs from most other existing frameworks in that it seeks to make algorithms explicit and data structures transparent, as opposed to the more common black-box frameworks. Freely available with extensive documentation at http://deap.gel.ulaval.ca, DEAP is an open source project under an LGPL license. Keywords: distributed evolutionary algorithms, software tools
2 0.065119788 90 jmlr-2012-Pattern for Python
Author: Tom De Smedt, Walter Daelemans
Abstract: Pattern is a package for Python 2.4+ with functionality for web mining (Google + Twitter + Wikipedia, web spider, HTML DOM parser), natural language processing (tagger/chunker, n-gram search, sentiment analysis, WordNet), machine learning (vector space model, k-means clustering, Naive Bayes + k-NN + SVM classifiers) and network analysis (graph centrality and visualization). It is well documented and bundled with 30+ examples and 350+ unit tests. The source code is licensed under BSD and available from http://www.clips.ua.ac.be/pages/pattern. Keywords: Python, data mining, natural language processing, machine learning, graph networks
3 0.059492171 61 jmlr-2012-ML-Flex: A Flexible Toolbox for Performing Classification Analyses In Parallel
Author: Stephen R. Piccolo, Lewis J. Frey
Abstract: Motivated by a need to classify high-dimensional, heterogeneous data from the bioinformatics domain, we developed ML-Flex, a machine-learning toolbox that enables users to perform two-class and multi-class classification analyses in a systematic yet flexible manner. ML-Flex was written in Java but is capable of interfacing with third-party packages written in other programming languages. It can handle multiple input-data formats and supports a variety of customizations. MLFlex provides implementations of various validation strategies, which can be executed in parallel across multiple computing cores, processors, and nodes. Additionally, ML-Flex supports aggregating evidence across multiple algorithms and data sets via ensemble learning. This open-source software package is freely available from http://mlflex.sourceforge.net. Keywords: toolbox, classification, parallel, ensemble, reproducible research
4 0.052465536 21 jmlr-2012-Bayesian Mixed-Effects Inference on Classification Performance in Hierarchical Data Sets
Author: Kay H. Brodersen, Christoph Mathys, Justin R. Chumbley, Jean Daunizeau, Cheng Soon Ong, Joachim M. Buhmann, Klaas E. Stephan
Abstract: Classification algorithms are frequently used on data with a natural hierarchical structure. For instance, classifiers are often trained and tested on trial-wise measurements, separately for each subject within a group. One important question is how classification outcomes observed in individual subjects can be generalized to the population from which the group was sampled. To address this question, this paper introduces novel statistical models that are guided by three desiderata. First, all models explicitly respect the hierarchical nature of the data, that is, they are mixed-effects models that simultaneously account for within-subjects (fixed-effects) and across-subjects (random-effects) variance components. Second, maximum-likelihood estimation is replaced by full Bayesian inference in order to enable natural regularization of the estimation problem and to afford conclusions in terms of posterior probability statements. Third, inference on classification accuracy is complemented by inference on the balanced accuracy, which avoids inflated accuracy estimates for imbalanced data sets. We introduce hierarchical models that satisfy these criteria and demonstrate their advantages over conventional methods using MCMC implementations for model inversion and model selection on both synthetic and empirical data. We envisage that our approach will improve the sensitivity and validity of statistical inference in future hierarchical classification studies. Keywords: beta-binomial, normal-binomial, balanced accuracy, Bayesian inference, group studies
5 0.044576302 107 jmlr-2012-Smoothing Multivariate Performance Measures
Author: Xinhua Zhang, Ankan Saha, S.V.N. Vishwanathan
Abstract: Optimizing multivariate performance measure is an important task in Machine Learning. Joachims (2005) introduced a Support Vector Method whose underlying optimization problem is commonly solved by cutting plane methods (CPMs) such as SVM-Perf and BMRM. It can be shown that CPMs 1 converge to an ε accurate solution in O λε iterations, where λ is the trade-off parameter between the regularizer and the loss function. Motivated by the impressive convergence rate of CPM on a number of practical problems, it was conjectured that these rates can be further improved. We disprove this conjecture in this paper by constructing counter examples. However, surprisingly, we further discover that these problems are not inherently hard, and we develop a novel smoothing strategy, which in conjunction with Nesterov’s accelerated gradient method, can find an ε accuiterations. Computationally, our smoothing technique is also rate solution in O∗ min 1 , √1 ε λε particularly advantageous for optimizing multivariate performance scores such as precision/recall break-even point and ROCArea; the cost per iteration remains the same as that of CPMs. Empirical evaluation on some of the largest publicly available data sets shows that our method converges significantly faster than CPMs without sacrificing generalization ability. Keywords: non-smooth optimization, max-margin methods, multivariate performance measures, Support Vector Machines, smoothing
6 0.042706564 57 jmlr-2012-Learning Symbolic Representations of Hybrid Dynamical Systems
7 0.04252832 118 jmlr-2012-Variational Multinomial Logit Gaussian Process
8 0.037183661 79 jmlr-2012-Oger: Modular Learning Architectures For Large-Scale Sequential Processing
9 0.030916339 119 jmlr-2012-glm-ie: Generalised Linear Models Inference & Estimation Toolbox
10 0.02992022 75 jmlr-2012-NIMFA : A Python Library for Nonnegative Matrix Factorization
11 0.026874611 38 jmlr-2012-Entropy Search for Information-Efficient Global Optimization
12 0.020893637 74 jmlr-2012-Multi Kernel Learning with Online-Batch Optimization
13 0.019679429 30 jmlr-2012-DARWIN: A Framework for Machine Learning and Computer Vision Research and Development
14 0.018828373 95 jmlr-2012-Random Search for Hyper-Parameter Optimization
15 0.018789282 42 jmlr-2012-Facilitating Score and Causal Inference Trees for Large Observational Studies
16 0.018113537 9 jmlr-2012-A Topic Modeling Toolbox Using Belief Propagation
17 0.016607549 47 jmlr-2012-GPLP: A Local and Parallel Computation Toolbox for Gaussian Process Regression
18 0.015649531 102 jmlr-2012-Sally: A Tool for Embedding Strings in Vector Spaces
19 0.015396557 88 jmlr-2012-PREA: Personalized Recommendation Algorithms Toolkit
20 0.013647195 53 jmlr-2012-Jstacs: A Java Framework for Statistical Analysis and Classification of Biological Sequences
topicId topicWeight
[(0, -0.069), (1, 0.028), (2, 0.126), (3, -0.03), (4, 0.053), (5, 0.036), (6, 0.078), (7, 0.025), (8, -0.081), (9, -0.018), (10, -0.063), (11, -0.037), (12, 0.132), (13, -0.01), (14, -0.09), (15, 0.03), (16, 0.009), (17, -0.001), (18, -0.124), (19, -0.17), (20, 0.029), (21, 0.063), (22, -0.102), (23, 0.018), (24, -0.143), (25, -0.343), (26, -0.017), (27, -0.032), (28, -0.021), (29, -0.034), (30, -0.212), (31, 0.085), (32, -0.016), (33, -0.22), (34, -0.078), (35, -0.001), (36, 0.176), (37, -0.003), (38, -0.008), (39, -0.04), (40, 0.059), (41, 0.022), (42, 0.114), (43, -0.021), (44, -0.059), (45, -0.036), (46, 0.01), (47, 0.097), (48, 0.023), (49, 0.093)]
simIndex simValue paperId paperTitle
same-paper 1 0.97295409 31 jmlr-2012-DEAP: Evolutionary Algorithms Made Easy
Author: Félix-Antoine Fortin, François-Michel De Rainville, Marc-André Gardner, Marc Parizeau, Christian Gagné
Abstract: DEAP is a novel evolutionary computation framework for rapid prototyping and testing of ideas. Its design departs from most other existing frameworks in that it seeks to make algorithms explicit and data structures transparent, as opposed to the more common black-box frameworks. Freely available with extensive documentation at http://deap.gel.ulaval.ca, DEAP is an open source project under an LGPL license. Keywords: distributed evolutionary algorithms, software tools
2 0.61303896 107 jmlr-2012-Smoothing Multivariate Performance Measures
Author: Xinhua Zhang, Ankan Saha, S.V.N. Vishwanathan
Abstract: Optimizing multivariate performance measure is an important task in Machine Learning. Joachims (2005) introduced a Support Vector Method whose underlying optimization problem is commonly solved by cutting plane methods (CPMs) such as SVM-Perf and BMRM. It can be shown that CPMs 1 converge to an ε accurate solution in O λε iterations, where λ is the trade-off parameter between the regularizer and the loss function. Motivated by the impressive convergence rate of CPM on a number of practical problems, it was conjectured that these rates can be further improved. We disprove this conjecture in this paper by constructing counter examples. However, surprisingly, we further discover that these problems are not inherently hard, and we develop a novel smoothing strategy, which in conjunction with Nesterov’s accelerated gradient method, can find an ε accuiterations. Computationally, our smoothing technique is also rate solution in O∗ min 1 , √1 ε λε particularly advantageous for optimizing multivariate performance scores such as precision/recall break-even point and ROCArea; the cost per iteration remains the same as that of CPMs. Empirical evaluation on some of the largest publicly available data sets shows that our method converges significantly faster than CPMs without sacrificing generalization ability. Keywords: non-smooth optimization, max-margin methods, multivariate performance measures, Support Vector Machines, smoothing
3 0.47613457 90 jmlr-2012-Pattern for Python
Author: Tom De Smedt, Walter Daelemans
Abstract: Pattern is a package for Python 2.4+ with functionality for web mining (Google + Twitter + Wikipedia, web spider, HTML DOM parser), natural language processing (tagger/chunker, n-gram search, sentiment analysis, WordNet), machine learning (vector space model, k-means clustering, Naive Bayes + k-NN + SVM classifiers) and network analysis (graph centrality and visualization). It is well documented and bundled with 30+ examples and 350+ unit tests. The source code is licensed under BSD and available from http://www.clips.ua.ac.be/pages/pattern. Keywords: Python, data mining, natural language processing, machine learning, graph networks
4 0.45600981 79 jmlr-2012-Oger: Modular Learning Architectures For Large-Scale Sequential Processing
Author: David Verstraeten, Benjamin Schrauwen, Sander Dieleman, Philemon Brakel, Pieter Buteneers, Dejan Pecevski
Abstract: Oger (OrGanic Environment for Reservoir computing) is a Python toolbox for building, training and evaluating modular learning architectures on large data sets. It builds on MDP for its modularity, and adds processing of sequential data sets, gradient descent training, several crossvalidation schemes and parallel parameter optimization methods. Additionally, several learning algorithms are implemented, such as different reservoir implementations (both sigmoid and spiking), ridge regression, conditional restricted Boltzmann machine (CRBM) and others, including GPU accelerated versions. Oger is released under the GNU LGPL, and is available from http: //organic.elis.ugent.be/oger. Keywords: Python, modular architectures, sequential processing
5 0.36317044 61 jmlr-2012-ML-Flex: A Flexible Toolbox for Performing Classification Analyses In Parallel
Author: Stephen R. Piccolo, Lewis J. Frey
Abstract: Motivated by a need to classify high-dimensional, heterogeneous data from the bioinformatics domain, we developed ML-Flex, a machine-learning toolbox that enables users to perform two-class and multi-class classification analyses in a systematic yet flexible manner. ML-Flex was written in Java but is capable of interfacing with third-party packages written in other programming languages. It can handle multiple input-data formats and supports a variety of customizations. MLFlex provides implementations of various validation strategies, which can be executed in parallel across multiple computing cores, processors, and nodes. Additionally, ML-Flex supports aggregating evidence across multiple algorithms and data sets via ensemble learning. This open-source software package is freely available from http://mlflex.sourceforge.net. Keywords: toolbox, classification, parallel, ensemble, reproducible research
6 0.30434424 57 jmlr-2012-Learning Symbolic Representations of Hybrid Dynamical Systems
7 0.2624822 21 jmlr-2012-Bayesian Mixed-Effects Inference on Classification Performance in Hierarchical Data Sets
8 0.20818903 118 jmlr-2012-Variational Multinomial Logit Gaussian Process
9 0.17450461 95 jmlr-2012-Random Search for Hyper-Parameter Optimization
10 0.16034786 119 jmlr-2012-glm-ie: Generalised Linear Models Inference & Estimation Toolbox
11 0.15478444 30 jmlr-2012-DARWIN: A Framework for Machine Learning and Computer Vision Research and Development
12 0.12069041 93 jmlr-2012-Quantum Set Intersection and its Application to Associative Memory
13 0.12034672 75 jmlr-2012-NIMFA : A Python Library for Nonnegative Matrix Factorization
14 0.11081687 42 jmlr-2012-Facilitating Score and Causal Inference Trees for Large Observational Studies
15 0.1096179 102 jmlr-2012-Sally: A Tool for Embedding Strings in Vector Spaces
16 0.10596324 38 jmlr-2012-Entropy Search for Information-Efficient Global Optimization
17 0.10502796 12 jmlr-2012-Active Clustering of Biological Sequences
18 0.10476618 22 jmlr-2012-Bounding the Probability of Error for High Precision Optical Character Recognition
19 0.10335042 6 jmlr-2012-A Model of the Perception of Facial Expressions of Emotion by Humans: Research Overview and Perspectives
20 0.098547868 16 jmlr-2012-Algorithms for Learning Kernels Based on Centered Alignment
topicId topicWeight
[(7, 0.011), (21, 0.721), (26, 0.016), (27, 0.017), (29, 0.014), (49, 0.013), (56, 0.039), (75, 0.018), (92, 0.017), (96, 0.033)]
simIndex simValue paperId paperTitle
same-paper 1 0.94169164 31 jmlr-2012-DEAP: Evolutionary Algorithms Made Easy
Author: Félix-Antoine Fortin, François-Michel De Rainville, Marc-André Gardner, Marc Parizeau, Christian Gagné
Abstract: DEAP is a novel evolutionary computation framework for rapid prototyping and testing of ideas. Its design departs from most other existing frameworks in that it seeks to make algorithms explicit and data structures transparent, as opposed to the more common black-box frameworks. Freely available with extensive documentation at http://deap.gel.ulaval.ca, DEAP is an open source project under an LGPL license. Keywords: distributed evolutionary algorithms, software tools
Author: Jian Huang, Cun-Hui Zhang
Abstract: The ℓ1 -penalized method, or the Lasso, has emerged as an important tool for the analysis of large data sets. Many important results have been obtained for the Lasso in linear regression which have led to a deeper understanding of high-dimensional statistical problems. In this article, we consider a class of weighted ℓ1 -penalized estimators for convex loss functions of a general form, including the generalized linear models. We study the estimation, prediction, selection and sparsity properties of the weighted ℓ1 -penalized estimator in sparse, high-dimensional settings where the number of predictors p can be much larger than the sample size n. Adaptive Lasso is considered as a special case. A multistage method is developed to approximate concave regularized estimation by applying an adaptive Lasso recursively. We provide prediction and estimation oracle inequalities for single- and multi-stage estimators, a general selection consistency theorem, and an upper bound for the dimension of the Lasso estimator. Important models including the linear regression, logistic regression and log-linear models are used throughout to illustrate the applications of the general results. Keywords: variable selection, penalized estimation, oracle inequality, generalized linear models, selection consistency, sparsity
3 0.76576018 20 jmlr-2012-Analysis of a Random Forests Model
Author: Gérard Biau
Abstract: Random forests are a scheme proposed by Leo Breiman in the 2000’s for building a predictor ensemble with a set of decision trees that grow in randomly selected subspaces of data. Despite growing interest and practical use, there has been little exploration of the statistical properties of random forests, and little is known about the mathematical forces driving the algorithm. In this paper, we offer an in-depth analysis of a random forests model suggested by Breiman (2004), which is very close to the original algorithm. We show in particular that the procedure is consistent and adapts to sparsity, in the sense that its rate of convergence depends only on the number of strong features and not on how many noise variables are present. Keywords: random forests, randomization, sparsity, dimension reduction, consistency, rate of convergence
4 0.37596738 117 jmlr-2012-Variable Selection in High-dimensional Varying-coefficient Models with Global Optimality
Author: Lan Xue, Annie Qu
Abstract: The varying-coefficient model is flexible and powerful for modeling the dynamic changes of regression coefficients. It is important to identify significant covariates associated with response variables, especially for high-dimensional settings where the number of covariates can be larger than the sample size. We consider model selection in the high-dimensional setting and adopt difference convex programming to approximate the L0 penalty, and we investigate the global optimality properties of the varying-coefficient estimator. The challenge of the variable selection problem here is that the dimension of the nonparametric form for the varying-coefficient modeling could be infinite, in addition to dealing with the high-dimensional linear covariates. We show that the proposed varying-coefficient estimator is consistent, enjoys the oracle property and achieves an optimal convergence rate for the non-zero nonparametric components for high-dimensional data. Our simulations and numerical examples indicate that the difference convex algorithm is efficient using the coordinate decent algorithm, and is able to select the true model at a higher frequency than the least absolute shrinkage and selection operator (LASSO), the adaptive LASSO and the smoothly clipped absolute deviation (SCAD) approaches. Keywords: coordinate decent algorithm, difference convex programming, L0 - regularization, large-p small-n, model selection, nonparametric function, oracle property, truncated L1 penalty
5 0.32920077 7 jmlr-2012-A Multi-Stage Framework for Dantzig Selector and LASSO
Author: Ji Liu, Peter Wonka, Jieping Ye
Abstract: We consider the following sparse signal recovery (or feature selection) problem: given a design matrix X ∈ Rn×m (m ≫ n) and a noisy observation vector y ∈ Rn satisfying y = Xβ∗ + ε where ε is the noise vector following a Gaussian distribution N(0, σ2 I), how to recover the signal (or parameter vector) β∗ when the signal is sparse? The Dantzig selector has been proposed for sparse signal recovery with strong theoretical guarantees. In this paper, we propose a multi-stage Dantzig selector method, which iteratively refines the target signal β∗ . We show that if X obeys a certain condition, then with a large probability ˆ the difference between the solution β estimated by the proposed method and the true solution β∗ measured in terms of the ℓ p norm (p ≥ 1) is bounded as ˆ β − β∗ p ≤ C(s − N)1/p log m + ∆ σ, where C is a constant, s is the number of nonzero entries in β∗ , the risk of the oracle estimator ∆ is independent of m and is much smaller than the first term, and N is the number of entries of β∗ larger √ than a certain value in the order of O (σ log m). The proposed √ method improves the estimation √ bound of the standard Dantzig selector approximately from Cs1/p log mσ to C(s − N)1/p log mσ where the value N depends on the number of large entries in β∗ . When N = s, the proposed algorithm achieves the oracle solution with a high probability, where the oracle solution is the projection of the observation vector y onto true features. In addition, with a large probability, the proposed method can select the same number of correct features under a milder condition than the Dantzig selector. Finally, we extend this multi-stage procedure to the LASSO case. Keywords: multi-stage, Dantzig selector, LASSO, sparse signal recovery
6 0.3276813 42 jmlr-2012-Facilitating Score and Causal Inference Trees for Large Observational Studies
7 0.32532015 29 jmlr-2012-Consistent Model Selection Criteria on High Dimensions
8 0.32317859 2 jmlr-2012-A Comparison of the Lasso and Marginal Regression
9 0.28204903 119 jmlr-2012-glm-ie: Generalised Linear Models Inference & Estimation Toolbox
10 0.27293861 8 jmlr-2012-A Primal-Dual Convergence Analysis of Boosting
11 0.26678127 69 jmlr-2012-Mixability is Bayes Risk Curvature Relative to Log Loss
12 0.2542806 73 jmlr-2012-Multi-task Regression using Minimal Penalties
13 0.24792717 111 jmlr-2012-Structured Sparsity and Generalization
14 0.2462135 57 jmlr-2012-Learning Symbolic Representations of Hybrid Dynamical Systems
15 0.23849879 18 jmlr-2012-An Improved GLMNET for L1-regularized Logistic Regression
16 0.23618436 72 jmlr-2012-Multi-Target Regression with Rule Ensembles
17 0.23488881 30 jmlr-2012-DARWIN: A Framework for Machine Learning and Computer Vision Research and Development
18 0.23208293 63 jmlr-2012-Mal-ID: Automatic Malware Detection Using Common Segment Analysis and Meta-Features
19 0.22529197 96 jmlr-2012-Refinement of Operator-valued Reproducing Kernels
20 0.22125691 64 jmlr-2012-Manifold Identification in Dual Averaging for Regularized Stochastic Online Learning