jmlr jmlr2013 jmlr2013-112 knowledge-graph by maker-knowledge-mining

112 jmlr-2013-Tapkee: An Efficient Dimension Reduction Library


Source: pdf

Author: Sergey Lisitsyn, Christian Widmer, Fernando J. Iglesias Garcia

Abstract: We present Tapkee, a C++ template library that provides efficient implementations of more than 20 widely used dimensionality reduction techniques ranging from Locally Linear Embedding (Roweis and Saul, 2000) and Isomap (de Silva and Tenenbaum, 2002) to the recently introduced Barnes-Hut-SNE (van der Maaten, 2013). Our library was designed with a focus on performance and flexibility. For performance, we combine efficient multi-core algorithms, modern data structures and state-of-the-art low-level libraries. To achieve flexibility, we designed a clean interface for applying methods to user data and provide a callback API that facilitates integration with the library. The library is freely available as open-source software and is distributed under the permissive BSD 3-clause license. We encourage the integration of Tapkee into other open-source toolboxes and libraries. For example, Tapkee has been integrated into the codebase of the Shogun toolbox (Sonnenburg et al., 2010), giving us access to a rich set of kernels, distance measures and bindings to common programming languages including Python, Octave, Matlab, R, Java, C#, Ruby, Perl and Lua. Source code, examples and documentation are available at http://tapkee.lisitsyn.me. Keywords: dimensionality reduction, machine learning, C++, open source software

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Our library was designed with a focus on performance and flexibility. [sent-9, score-0.215]

2 For performance, we combine efficient multi-core algorithms, modern data structures and state-of-the-art low-level libraries. [sent-10, score-0.067]

3 To achieve flexibility, we designed a clean interface for applying methods to user data and provide a callback API that facilitates integration with the library. [sent-11, score-0.209]

4 The library is freely available as open-source software and is distributed under the permissive BSD 3-clause license. [sent-12, score-0.323]

5 We encourage the integration of Tapkee into other open-source toolboxes and libraries. [sent-13, score-0.115]

6 For example, Tapkee has been integrated into the codebase of the Shogun toolbox (Sonnenburg et al., 2010), giving us access to a rich set of kernels, distance measures and bindings to common programming languages including Python, Octave, Matlab, R, Java, C#, Ruby, Perl and Lua. [sent-14, score-0.098] [sent-15, score-0.121]

8 Source code, examples and documentation are available at http://tapkee.lisitsyn.me. [sent-16, score-0.04]

9 Keywords: dimensionality reduction, machine learning, C++, open source software. [sent-19, score-0.137]

10 Introduction: The aim of dimension reduction is to find low-dimensional representations of data to facilitate data visualization, interpretation and preprocessing for further analysis. [sent-20, score-0.095]

11 Our library, Tapkee, provides numerous efficient implementations of both linear and non-linear dimension reduction algorithms, ranging from established methods to more recent developments in the field. [sent-22, score-0.184]

12 Implementations of modern dimensionality reduction algorithms are available in several other open-source toolboxes, such as Scikit-learn (Pedregosa et al., 2011). [sent-23, score-0.157]

13 Another example is the Matlab Toolbox for Dimensionality Reduction by Laurens van der Maaten. [sent-29, score-0.099]

14 In contrast to existing toolkits, our aim is to provide a generic C++ library in the spirit of the Standard Template Library to allow for greater flexibility. [sent-30, score-0.215]

15 Our library is callback-based (i.e., externally defined functions may be passed to the library as arguments), which naturally enables the user to combine our library with custom distance or similarity measures, which are at the core of most dimensionality reduction algorithms. [sent-33, score-0.883]
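
To make the callback idea concrete, here is a minimal sketch (illustrative only, not the actual Tapkee API; all names are hypothetical) of a routine templated on a user-supplied distance callback:

#include <cmath>
#include <cstddef>
#include <vector>

// Minimal sketch of a callback-centric interface (illustrative, not the actual
// Tapkee API): the routine is templated on a user-supplied distance callback,
// so externally defined functions can be plugged in without the library knowing
// anything about the user's data representation.
template <typename DistanceCallback>
std::vector<std::vector<double> > pairwise_distances(std::size_t n, DistanceCallback distance)
{
    std::vector<std::vector<double> > D(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = i + 1; j < n; ++j)
            D[i][j] = D[j][i] = distance(i, j); // the callback decides how to compare i and j
    return D;
}

int main()
{
    // The data stays on the user's side; the callback only receives indices.
    std::vector<std::vector<double> > points = {{0.0, 0.0}, {1.0, 0.0}, {0.0, 2.0}};
    auto euclidean = [&](std::size_t i, std::size_t j) {
        double sum = 0.0;
        for (std::size_t k = 0; k < points[i].size(); ++k) {
            double diff = points[i][k] - points[j][k];
            sum += diff * diff;
        }
        return std::sqrt(sum);
    };
    std::vector<std::vector<double> > D = pairwise_distances(points.size(), euclidean);
    (void)D;
    return 0;
}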

16 For this, we combine advanced data structures (such as the cover tree by Beygelzimer et al., 2006) with efficient libraries (such as ARPACK) and carefully engineered, well-tested C++ code. [sent-35, score-0.094] [sent-36, score-0.085]

18 Furthermore, we provide parallel implementations of many algorithms using OpenMP, along with preliminary support for GPU computing. [sent-37, score-0.089]
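
The typical parallelization pattern here is a work-sharing loop over independent rows; the sketch below (not Tapkee's internal code) shows the OpenMP idiom, to be compiled with -fopenmp:

#include <cmath>
#include <cstddef>
#include <vector>

// Illustrative OpenMP pattern (not Tapkee's internal code): rows of a distance
// computation are independent, so they can be distributed across threads with a
// single work-sharing pragma. Compile with -fopenmp (gcc/clang).
std::vector<double> distance_row_sums(const std::vector<std::vector<double> >& X)
{
    const long n = static_cast<long>(X.size());
    std::vector<double> sums(static_cast<std::size_t>(n), 0.0);
    #pragma omp parallel for schedule(dynamic)
    for (long i = 0; i < n; ++i) {               // signed loop index keeps older OpenMP versions happy
        double acc = 0.0;
        for (long j = 0; j < n; ++j) {
            double s = 0.0;
            for (std::size_t k = 0; k < X[i].size(); ++k) {
                double diff = X[i][k] - X[j][k];
                s += diff * diff;
            }
            acc += std::sqrt(s);
        }
        sums[i] = acc;                           // each iteration writes its own slot, so no locking is needed
    }
    return sums;
}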

19 To support the user in the choice of algorithms, we provide mathematical background for each method on the project website. [sent-40, score-0.037]

20 To facilitate the use of Tapkee for the end-user, we distribute our library as part of the Shogun toolbox, providing access to its bindings for languages such as Python, MATLAB and Java, as well as to its rich set of kernel and distance functions. [sent-41, score-0.336]

21 The integration of Tapkee into C++ projects or other toolkits is encouraged by its software design (a template library with callbacks and only a few dependencies) as well as by the choice of a permissive software license (BSD). [sent-43, score-0.699]

22 Source code, documentation and various graphical and code examples can be found on the project website: http://tapkee.lisitsyn.me. [sent-44, score-0.101]

23 Implemented Algorithms: Currently, our toolkit provides implementations of the following algorithms: [sent-48, score-0.142]

24 Classic distance-based methods such as Multidimensional Scaling and Isomap, along with their landmark approximations, Landmark Multidimensional Scaling and Landmark Isomap (de Silva and Tenenbaum, 2002). [sent-53, score-0.064]

25 Iterative methods such as Stochastic Proximity Embedding (Agrafiotis, 2003), Factor Analysis, t-SNE and Barnes-Hut-SNE (van der Maaten, 2013). [sent-55, score-0.065]

26 Software Design: Tapkee is a pure C++ template library with a flexible, modular structure. [sent-59, score-0.324]

27 In practice, this allows easier integration of the library code into existing projects without the need for linking. [sent-60, score-0.357]

28 Furthermore, the library code is specialized during compilation, which enables compile-time optimization and customization. [sent-61, score-0.302]

29 In principle, our library is engineered to be callback-based (with externally defined functions passed to the library as arguments). [sent-65, score-0.577]

30 The algorithms in our library are implemented as generically as possible (formulated in terms of callback functions), with callbacks providing the interface between the user's data and the algorithm. [sent-66, score-0.432]

31 For example, the user may want to embed biological sequences using a string kernel from a third party library, which—using callbacks—can be passed to Tapkee as a custom similarity measure. [sent-67, score-0.279]

32 A major benefit of our callback-centric design is the improved re-usability of our software, as user-defined callback functions may contain complex custom code or operate on custom data structures without any changes to our library code. [sent-68, score-0.682]
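
As a toy illustration of this benefit, the similarity callback below closes over a container of strings, so nothing string-specific ever enters the (hypothetical) embedding routine; a real application would call a proper third-party string kernel instead of this crude similarity:

#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Toy example of a callback over custom data (here std::string sequences): the
// callback closes over the container, so the embedding code never needs to know
// the data type. A real application would call a third-party string kernel here.
int main()
{
    std::vector<std::string> sequences = {"ATGCC", "ATGCA", "GGGTA"};

    auto similarity = [&](std::size_t i, std::size_t j) {
        const std::string& a = sequences[i];
        const std::string& b = sequences[j];
        std::size_t m = std::min(a.size(), b.size());
        std::size_t p = 0;
        while (p < m && a[p] == b[p]) ++p;       // longest common prefix as a crude similarity
        return static_cast<double>(p) / static_cast<double>(std::max(a.size(), b.size()));
    };

    double s = similarity(0, 1);                 // pass 'similarity' wherever a kernel callback is expected
    (void)s;
    return 0;
}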

33 Most dimensionality reduction algorithms rely on linear algebra operations, nearest neighbor finding and the solving of eigenproblems. [sent-69, score-0.184]

34 To address the need for high-performance linear algebra, we leverage the Eigen3 template library. [sent-70, score-0.105]
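
For a flavour of the dense linear algebra involved, the sketch below implements textbook classical MDS, one of the methods listed above, in a few lines of Eigen3 (it is a generic sketch, not Tapkee's code):

#include <Eigen/Dense>
#include <algorithm>
#include <cmath>
#include <iostream>

// Textbook classical (Torgerson) MDS with Eigen3: double-center the squared
// distance matrix and take the top eigenpairs. A sketch, not Tapkee's code.
Eigen::MatrixXd classical_mds(const Eigen::MatrixXd& D, int dim)
{
    const int n = static_cast<int>(D.rows());
    Eigen::MatrixXd D2 = D.cwiseProduct(D);                        // squared distances
    Eigen::MatrixXd J  = Eigen::MatrixXd::Identity(n, n)
                       - Eigen::MatrixXd::Constant(n, n, 1.0 / n); // centering matrix
    Eigen::MatrixXd B  = -0.5 * J * D2 * J;                        // Gram matrix

    Eigen::SelfAdjointEigenSolver<Eigen::MatrixXd> eig(B);         // eigenvalues in ascending order
    Eigen::MatrixXd Y(n, dim);
    for (int k = 0; k < dim; ++k) {
        const int idx = n - 1 - k;                                 // k-th largest eigenpair
        const double lambda = std::max(eig.eigenvalues()(idx), 0.0);
        Y.col(k) = eig.eigenvectors().col(idx) * std::sqrt(lambda);
    }
    return Y;
}

int main()
{
    Eigen::MatrixXd D(3, 3);
    D << 0, 1, 2,
         1, 0, 1,
         2, 1, 0;                                                  // distances of three collinear points
    std::cout << classical_mds(D, 1) << std::endl;                 // recovers a 1-D line up to sign
    return 0;
}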

35 This makes our implementations safe, easy to read and fast. [sent-71, score-0.089]

36 Next, we approach the nearest neighbor problem with the vantage point tree and the cover tree data structures (Beygelzimer et al., 2006). [sent-72, score-0.059]
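
The vantage point tree idea can be illustrated with a toy version over scalar values and the metric |a - b| (a simplified sketch; the production structures referenced above handle arbitrary metrics and are considerably more refined):

#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Toy vantage point tree over scalars with the metric |a - b|, to illustrate
// the idea; real implementations handle arbitrary metrics and k > 1 neighbors.
struct VPTree {
    struct Node {
        std::size_t point;   // index of the vantage point
        double radius;       // median distance from the vantage point to its subtree
        int inside;          // child holding points with distance <= radius (-1 if none)
        int outside;         // child holding points with distance  > radius (-1 if none)
    };

    const std::vector<double>& data;
    std::vector<Node> nodes;
    int root;

    explicit VPTree(const std::vector<double>& d) : data(d), root(-1) {
        std::vector<std::size_t> idx(d.size());
        for (std::size_t i = 0; i < d.size(); ++i) idx[i] = i;
        root = build(idx);
    }

    static double dist(double a, double b) { return std::fabs(a - b); }

    int build(std::vector<std::size_t> idx) {
        if (idx.empty()) return -1;
        Node node;
        node.point = idx.back();                 // last point becomes the vantage point
        node.radius = 0.0;
        node.inside = node.outside = -1;
        idx.pop_back();
        const int id = static_cast<int>(nodes.size());
        nodes.push_back(node);
        if (!idx.empty()) {
            std::vector<std::size_t>::iterator mid = idx.begin() + idx.size() / 2;
            const double vp = data[node.point];
            std::nth_element(idx.begin(), mid, idx.end(),
                [&](std::size_t a, std::size_t b) { return dist(data[a], vp) < dist(data[b], vp); });
            nodes[id].radius = dist(data[*mid], vp);
            std::vector<std::size_t> in(idx.begin(), mid), out(mid, idx.end());
            nodes[id].inside = build(in);        // 'nodes' may reallocate, so always index by id
            nodes[id].outside = build(out);
        }
        return id;
    }

    // Nearest neighbor of 'query' (assumes the data set is non-empty).
    std::size_t nearest(double query) const {
        std::size_t best = 0;
        double tau = std::numeric_limits<double>::max();
        search(root, query, best, tau);
        return best;
    }

    void search(int id, double query, std::size_t& best, double& tau) const {
        if (id < 0) return;
        const Node& node = nodes[id];
        const double d = dist(query, data[node.point]);
        if (d < tau) { tau = d; best = node.point; }
        if (d < node.radius) {                   // query falls inside the ball: search inside first
            if (d - tau <= node.radius) search(node.inside, query, best, tau);
            if (d + tau >= node.radius) search(node.outside, query, best, tau);
        } else {
            if (d + tau >= node.radius) search(node.outside, query, best, tau);
            if (d - tau <= node.radius) search(node.inside, query, best, tau);
        }
    }
};

int main()
{
    std::vector<double> values = {0.3, 2.7, 1.1, 5.9, 4.2};
    VPTree tree(values);
    std::size_t nn = tree.nearest(4.0);          // expected: index of 4.2
    (void)nn;
    return 0;
}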

37 Finally, to solve the eigenproblems arising in most of the implemented algorithms, we employ three methods: QR decomposition, the Lanczos method from the ARPACK library, and a randomized method described in Halko et al. [sent-74, score-0.215]
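
The randomized approach can be sketched with Eigen following the general recipe of Halko et al.; the code below assumes a symmetric input matrix, uses illustrative default parameters, and is not Tapkee's implementation:

#include <Eigen/Dense>
#include <algorithm>
#include <utility>

// Randomized eigensolver for a symmetric matrix A, in the spirit of Halko et al.:
// sample a random subspace, orthonormalize it, and solve the small projected
// problem. Oversampling and power-iteration counts are illustrative defaults;
// requires 1 <= k <= A.rows().
std::pair<Eigen::VectorXd, Eigen::MatrixXd>
randomized_symmetric_eig(const Eigen::MatrixXd& A, int k, int oversample = 10, int power_iters = 2)
{
    const int n = static_cast<int>(A.rows());
    const int l = std::min(n, k + oversample);

    Eigen::MatrixXd Omega = Eigen::MatrixXd::Random(n, l);      // random test matrix
    Eigen::MatrixXd Y = A * Omega;
    for (int it = 0; it < power_iters; ++it)                    // power iterations sharpen the range
        Y = A * (A.transpose() * Y);

    Eigen::HouseholderQR<Eigen::MatrixXd> qr(Y);
    Eigen::MatrixXd Q = qr.householderQ() * Eigen::MatrixXd::Identity(n, l);  // thin orthonormal basis

    Eigen::MatrixXd B = Q.transpose() * A * Q;                  // small l-by-l projected problem
    Eigen::SelfAdjointEigenSolver<Eigen::MatrixXd> small(B);    // eigenvalues in ascending order

    Eigen::VectorXd values  = small.eigenvalues().tail(k).reverse();                    // top-k, descending
    Eigen::MatrixXd vectors = Q * small.eigenvectors().rightCols(k).rowwise().reverse();
    return std::make_pair(values, vectors);
}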

38 The library supports all major platforms (Linux, Mac OS X and Windows). [sent-76, score-0.215]

39 To ensure quality we combine unit-testing and continuous integration with Travis. [sent-77, score-0.09]

40 Applications: We have successfully applied Tapkee to compute low-dimensional representations of various data sets including images and biological sequences. [sent-79, score-0.033]

41 To demonstrate an embedding of string objects, we computed a 2D embedding of English words and biological sequences (gene starts) for several organisms. [sent-81, score-0.387]

42 See Figure 1 for examples of 2D representations computed with Tapkee and the website for more graphical examples. [sent-82, score-0.03]

43 At the time of writing, we are aware of several successful applications of Tapkee to problems in hydrodynamics, bioinformatics and exploratory analysis of energy landscapes, from user feedback. [sent-83, score-0.037]

44 Comparison To Related Work: We have compared our implementations with their counterparts in Scikit-learn 0. [sent-85, score-0.089]

45 For Tapkee, results for 4 threads and for a single thread (in parentheses) are shown. [sent-129, score-0.037]

46 We ran a speed comparison for implementations of two widely used methods that are present in all of the above libraries: Locally Linear Embedding and Isomap. [sent-130, score-0.089]

47 For this comparison we used four data sets: a subset of MNIST (2000 vectors of size 784), a swissroll with 5000 3D vectors, images from the MIT-CBCL face recognition data set (384 vectors of size 40,000) and a subsampled hyperspectral image obtained from the AVIRIS project (2520 vectors of size 224). [sent-131, score-0.077]

48 We note that when using a single thread Tapkee outperforms the other implementations in all but one case. [sent-133, score-0.126]

49 When using 4 cores, Tapkee is considerably faster and outperforms the other libraries in all experiments. [sent-134, score-0.085]

50 In summary, we find that the choice of algorithms and low-level libraries enables Tapkee to outperform other implementations under various conditions. [sent-138, score-0.174]

51 For the code to reproduce the above experiments (and additional benchmarks) and an overview of available methods for each toolkit, please see our supplementary webpage http://iglesias. [sent-139, score-0.061]

52 Conclusion: We have implemented an efficient and flexible library for dimension reduction, in which all methods use state-of-the-art algorithms and data structures. [sent-143, score-0.31]

53 Our library readily handles big data sets and facilitates interaction with custom code using a callback-centric API. [sent-144, score-0.431]

54 To compare our toolkit to existing software, we provide a speed comparison on a range of data sets showing considerable speed-up. [sent-145, score-0.053]

55 We believe the infrastructure provided by Tapkee (such as API, modular design, bindings to libraries) can serve as a platform for further dimension reduction development. [sent-146, score-0.216]

56 Laplacian eigenmaps and spectral techniques for embedding and clustering. [sent-173, score-0.206]

57 Hessian eigenmaps: Locally linear embedding techniques for highdimensional data. [sent-197, score-0.159]

58 Linear local tangent space alignment and application to face recognition. [sent-274, score-0.112]

59 Principal manifolds and nonlinear dimensionality reduction via tangent space alignment. [sent-280, score-0.222]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('tapkee', 0.718), ('library', 0.215), ('waf', 0.179), ('embedding', 0.159), ('isomap', 0.13), ('custom', 0.128), ('lle', 0.102), ('reduction', 0.095), ('bindings', 0.09), ('callback', 0.09), ('callbacks', 0.09), ('lisitsyn', 0.09), ('mtfdr', 0.09), ('widmer', 0.09), ('implementations', 0.089), ('libraries', 0.085), ('template', 0.078), ('sonnenburg', 0.077), ('pedregosa', 0.077), ('swissroll', 0.077), ('toolbox', 0.072), ('maaten', 0.069), ('halko', 0.069), ('shogun', 0.069), ('der', 0.065), ('tangent', 0.065), ('landmark', 0.064), ('dimensionality', 0.062), ('code', 0.061), ('agra', 0.06), ('arpack', 0.06), ('aviris', 0.06), ('gashler', 0.06), ('glesias', 0.06), ('idmer', 0.06), ('iglesias', 0.06), ('isitsyn', 0.06), ('permissive', 0.06), ('samara', 0.06), ('toolboxes', 0.06), ('beygelzimer', 0.06), ('api', 0.06), ('matlab', 0.059), ('integration', 0.055), ('toolkit', 0.053), ('imension', 0.051), ('bsd', 0.051), ('toolkits', 0.051), ('externally', 0.051), ('engineered', 0.051), ('mnist', 0.049), ('software', 0.048), ('eigenmaps', 0.047), ('alignment', 0.047), ('ibrary', 0.046), ('eduction', 0.046), ('locality', 0.046), ('sergey', 0.046), ('passed', 0.045), ('silva', 0.045), ('preserving', 0.044), ('locally', 0.043), ('english', 0.043), ('fernando', 0.043), ('roweis', 0.04), ('documentation', 0.04), ('fficient', 0.04), ('coifman', 0.04), ('christian', 0.04), ('thread', 0.037), ('user', 0.037), ('java', 0.036), ('string', 0.036), ('combine', 0.035), ('python', 0.034), ('tenenbaum', 0.034), ('van', 0.034), ('biological', 0.033), ('structures', 0.032), ('diffusion', 0.031), ('modular', 0.031), ('languages', 0.031), ('proximity', 0.031), ('pca', 0.03), ('website', 0.03), ('design', 0.028), ('facilitates', 0.027), ('algebra', 0.027), ('source', 0.027), ('cover', 0.027), ('multidimensional', 0.026), ('projects', 0.026), ('kth', 0.026), ('belkin', 0.026), ('aerospace', 0.026), ('oating', 0.026), ('codebase', 0.026), ('laurens', 0.026), ('openmp', 0.026), ('compilation', 0.026)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000004 112 jmlr-2013-Tapkee: An Efficient Dimension Reduction Library

Author: Sergey Lisitsyn, Christian Widmer, Fernando J. Iglesias Garcia

Abstract: We present Tapkee, a C++ template library that provides efficient implementations of more than 20 widely used dimensionality reduction techniques ranging from Locally Linear Embedding (Roweis and Saul, 2000) and Isomap (de Silva and Tenenbaum, 2002) to the recently introduced BarnesHut-SNE (van der Maaten, 2013). Our library was designed with a focus on performance and flexibility. For performance, we combine efficient multi-core algorithms, modern data structures and state-of-the-art low-level libraries. To achieve flexibility, we designed a clean interface for applying methods to user data and provide a callback API that facilitates integration with the library. The library is freely available as open-source software and is distributed under the permissive BSD 3-clause license. We encourage the integration of Tapkee into other open-source toolboxes and libraries. For example, Tapkee has been integrated into the codebase of the Shogun toolbox (Sonnenburg et al., 2010), giving us access to a rich set of kernels, distance measures and bindings to common programming languages including Python, Octave, Matlab, R, Java, C#, Ruby, Perl and Lua. Source code, examples and documentation are available at http://tapkee.lisitsyn.me. Keywords: dimensionality reduction, machine learning, C++, open source software

2 0.10630911 86 jmlr-2013-Parallel Vector Field Embedding

Author: Binbin Lin, Xiaofei He, Chiyuan Zhang, Ming Ji

Abstract: We propose a novel local isometry based dimensionality reduction method from the perspective of vector fields, which is called parallel vector field embedding (PFE). We first give a discussion on local isometry and global isometry to show the intrinsic connection between parallel vector fields and isometry. The problem of finding an isometry turns out to be equivalent to finding orthonormal parallel vector fields on the data manifold. Therefore, we first find orthonormal parallel vector fields by solving a variational problem on the manifold. Then each embedding function can be obtained by requiring its gradient field to be as close to the corresponding parallel vector field as possible. Theoretical results show that our method can precisely recover the manifold if it is isometric to a connected open subset of Euclidean space. Both synthetic and real data examples demonstrate the effectiveness of our method even if there is heavy noise and high curvature. Keywords: manifold learning, isometry, vector field, covariant derivative, out-of-sample extension

3 0.10218068 54 jmlr-2013-JKernelMachines: A Simple Framework for Kernel Machines

Author: David Picard, Nicolas Thome, Matthieu Cord

Abstract: JKernelMachines is a Java library for learning with kernels. It is primarily designed to deal with custom kernels that are not easily found in standard libraries, such as kernels on structured data. These types of kernels are often used in computer vision or bioinformatics applications. We provide several kernels leading to state of the art classification performances in computer vision, as well as various kernels on sets. The main focus of the library is to be easily extended with new kernels. Standard SVM optimization algorithms are available, but also more sophisticated learning-based kernel combination methods such as Multiple Kernel Learning (MKL), and a recently published algorithm to learn powered products of similarities (Product Kernel Learning). Keywords: classification, support vector machines, kernel, computer vision

4 0.074026249 59 jmlr-2013-Large-scale SVD and Manifold Learning

Author: Ameet Talwalkar, Sanjiv Kumar, Mehryar Mohri, Henry Rowley

Abstract: This paper examines the efficacy of sampling-based low-rank approximation techniques when applied to large dense kernel matrices. We analyze two common approximate singular value decomposition techniques, namely the Nyström and Column sampling methods. We present a theoretical comparison between these two methods, provide novel insights regarding their suitability for various tasks and present experimental results that support our theory. Our results illustrate the relative strengths of each method. We next examine the performance of these two techniques on the large-scale task of extracting low-dimensional manifold structure given millions of high-dimensional face images. We address the computational challenges of non-linear dimensionality reduction via Isomap and Laplacian Eigenmaps, using a graph containing about 18 million nodes and 65 million edges. We present extensive experiments on learning low-dimensional embeddings for two large face data sets: CMU-PIE (35 thousand faces) and a web data set (18 million faces). Our comparisons show that the Nyström approximation is superior to the Column sampling method for this task. Furthermore, approximate Isomap tends to perform better than Laplacian Eigenmaps on both clustering and classification with the labeled CMU-PIE data set. Keywords: low-rank approximation, manifold learning, large-scale matrix factorization

5 0.069434039 34 jmlr-2013-Distance Preserving Embeddings for General n-Dimensional Manifolds

Author: Nakul Verma

Abstract: Low dimensional embeddings of manifold data have gained popularity in the last decade. However, a systematic finite sample analysis of manifold embedding algorithms largely eludes researchers. Here we present two algorithms that embed a general n-dimensional manifold into Rd (where d only depends on some key manifold properties such as its intrinsic dimension, volume and curvature) that guarantee to approximately preserve all interpoint geodesic distances. Keywords: manifold learning, isometric embeddings, non-linear dimensionality reduction, Nash’s embedding theorem

6 0.060635 67 jmlr-2013-MLPACK: A Scalable C++ Machine Learning Library

7 0.054803066 83 jmlr-2013-Orange: Data Mining Toolbox in Python

8 0.053004947 37 jmlr-2013-Divvy: Fast and Intuitive Exploratory Data Analysis

9 0.041550331 1 jmlr-2013-AC++Template-Based Reinforcement Learning Library: Fitting the Code to the Mathematics

10 0.040433981 77 jmlr-2013-On the Convergence of Maximum Variance Unfolding

11 0.039701268 96 jmlr-2013-Regularization-Free Principal Curve Estimation

12 0.039423574 46 jmlr-2013-GURLS: A Least Squares Library for Supervised Learning

13 0.037627924 109 jmlr-2013-Stress Functions for Nonlinear Dimension Reduction, Proximity Analysis, and Graph Drawing

14 0.035105638 19 jmlr-2013-BudgetedSVM: A Toolbox for Scalable SVM Approximations

15 0.032299403 113 jmlr-2013-The CAM Software for Nonnegative Blind Source Separation in R-Java

16 0.029740604 78 jmlr-2013-On the Learnability of Shuffle Ideals

17 0.023151606 69 jmlr-2013-Manifold Regularization and Semi-supervised Learning: Some Theoretical Analyses

18 0.018263478 70 jmlr-2013-Maximum Volume Clustering: A New Discriminative Clustering Approach

19 0.018195007 89 jmlr-2013-QuantMiner for Mining Quantitative Association Rules

20 0.017753344 52 jmlr-2013-How to Solve Classification and Regression Problems on High-Dimensional Data with a Supervised Extension of Slow Feature Analysis


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.106), (1, 0.072), (2, -0.04), (3, -0.219), (4, -0.055), (5, 0.065), (6, -0.048), (7, -0.033), (8, -0.072), (9, -0.095), (10, 0.05), (11, -0.241), (12, 0.071), (13, 0.008), (14, 0.085), (15, -0.262), (16, 0.075), (17, -0.148), (18, 0.121), (19, -0.048), (20, 0.208), (21, -0.02), (22, 0.064), (23, -0.062), (24, -0.069), (25, 0.098), (26, -0.062), (27, -0.061), (28, -0.059), (29, -0.104), (30, 0.039), (31, -0.058), (32, 0.007), (33, 0.006), (34, 0.015), (35, -0.072), (36, -0.014), (37, -0.039), (38, 0.042), (39, -0.012), (40, 0.027), (41, 0.017), (42, -0.045), (43, 0.072), (44, 0.028), (45, 0.007), (46, -0.053), (47, -0.0), (48, 0.035), (49, 0.009)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.97877955 112 jmlr-2013-Tapkee: An Efficient Dimension Reduction Library

Author: Sergey Lisitsyn, Christian Widmer, Fernando J. Iglesias Garcia

Abstract: We present Tapkee, a C++ template library that provides efficient implementations of more than 20 widely used dimensionality reduction techniques ranging from Locally Linear Embedding (Roweis and Saul, 2000) and Isomap (de Silva and Tenenbaum, 2002) to the recently introduced BarnesHut-SNE (van der Maaten, 2013). Our library was designed with a focus on performance and flexibility. For performance, we combine efficient multi-core algorithms, modern data structures and state-of-the-art low-level libraries. To achieve flexibility, we designed a clean interface for applying methods to user data and provide a callback API that facilitates integration with the library. The library is freely available as open-source software and is distributed under the permissive BSD 3-clause license. We encourage the integration of Tapkee into other open-source toolboxes and libraries. For example, Tapkee has been integrated into the codebase of the Shogun toolbox (Sonnenburg et al., 2010), giving us access to a rich set of kernels, distance measures and bindings to common programming languages including Python, Octave, Matlab, R, Java, C#, Ruby, Perl and Lua. Source code, examples and documentation are available at http://tapkee.lisitsyn.me. Keywords: dimensionality reduction, machine learning, C++, open source software

2 0.66309279 67 jmlr-2013-MLPACK: A Scalable C++ Machine Learning Library

Author: Ryan R. Curtin, James R. Cline, N. P. Slagle, William B. March, Parikshit Ram, Nishant A. Mehta, Alexander G. Gray

Abstract: MLPACK is a state-of-the-art, scalable, multi-platform C++ machine learning library released in late 2011 offering both a simple, consistent API accessible to novice users and high performance and flexibility to expert users by leveraging modern features of C++. MLPACK provides cutting-edge algorithms whose benchmarks exhibit far better performance than other leading machine learning libraries. MLPACK version 1.0.3, licensed under the LGPL, is available at http://www.mlpack.org. Keywords: C++, dual-tree algorithms, machine learning software, open source software, large-scale learning

3 0.51381552 54 jmlr-2013-JKernelMachines: A Simple Framework for Kernel Machines

Author: David Picard, Nicolas Thome, Matthieu Cord

Abstract: JKernelMachines is a Java library for learning with kernels. It is primarily designed to deal with custom kernels that are not easily found in standard libraries, such as kernels on structured data. These types of kernels are often used in computer vision or bioinformatics applications. We provide several kernels leading to state of the art classification performances in computer vision, as well as various kernels on sets. The main focus of the library is to be easily extended with new kernels. Standard SVM optimization algorithms are available, but also more sophisticated learning-based kernel combination methods such as Multiple Kernel Learning (MKL), and a recently published algorithm to learn powered products of similarities (Product Kernel Learning). Keywords: classification, support vector machines, kernel, computer vision

4 0.50593179 37 jmlr-2013-Divvy: Fast and Intuitive Exploratory Data Analysis

Author: Joshua M. Lewis, Virginia R. de Sa, Laurens van der Maaten

Abstract: Divvy is an application for applying unsupervised machine learning techniques (clustering and dimensionality reduction) to the data analysis process. Divvy provides a novel UI that allows researchers to tighten the action-perception loop of changing algorithm parameters and seeing a visualization of the result. Machine learning researchers can use Divvy to publish easy to use reference implementations of their algorithms, which helps the machine learning field have a greater impact on research practices elsewhere. Keywords: clustering, dimensionality reduction, open source software, human computer interaction, data visualization

5 0.48813286 1 jmlr-2013-AC++Template-Based Reinforcement Learning Library: Fitting the Code to the Mathematics

Author: Hervé Frezza-Buet, Matthieu Geist

Abstract: This paper introduces the rllib as an original C++ template-based library oriented toward value function estimation. Generic programming is promoted here as a way of having a good fit between the mathematics of reinforcement learning and their implementation in a library. The main concepts of rllib are presented, as well as a short example. Keywords: reinforcement learning, C++, generic programming

6 0.46173179 83 jmlr-2013-Orange: Data Mining Toolbox in Python

7 0.40962473 46 jmlr-2013-GURLS: A Least Squares Library for Supervised Learning

8 0.38621563 86 jmlr-2013-Parallel Vector Field Embedding

9 0.32741165 113 jmlr-2013-The CAM Software for Nonnegative Blind Source Separation in R-Java

10 0.31892708 109 jmlr-2013-Stress Functions for Nonlinear Dimension Reduction, Proximity Analysis, and Graph Drawing

11 0.27902314 34 jmlr-2013-Distance Preserving Embeddings for General n-Dimensional Manifolds

12 0.23956317 19 jmlr-2013-BudgetedSVM: A Toolbox for Scalable SVM Approximations

13 0.2299877 77 jmlr-2013-On the Convergence of Maximum Variance Unfolding

14 0.22451752 59 jmlr-2013-Large-scale SVD and Manifold Learning

15 0.1555948 33 jmlr-2013-Dimension Independent Similarity Computation

16 0.13412265 22 jmlr-2013-Classifying With Confidence From Incomplete Information

17 0.1234505 78 jmlr-2013-On the Learnability of Shuffle Ideals

18 0.11859128 25 jmlr-2013-Communication-Efficient Algorithms for Statistical Optimization

19 0.10964051 38 jmlr-2013-Dynamic Affine-Invariant Shape-Appearance Handshape Features and Classification in Sign Language Videos

20 0.10743703 7 jmlr-2013-A Risk Comparison of Ordinary Least Squares vs Ridge Regression


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.017), (5, 0.064), (6, 0.02), (10, 0.05), (20, 0.023), (23, 0.015), (44, 0.025), (68, 0.01), (75, 0.021), (87, 0.646)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.93273556 7 jmlr-2013-A Risk Comparison of Ordinary Least Squares vs Ridge Regression

Author: Paramveer S. Dhillon, Dean P. Foster, Sham M. Kakade, Lyle H. Ungar

Abstract: We compare the risk of ridge regression to a simple variant of ordinary least squares, in which one simply projects the data onto a finite dimensional subspace (as specified by a principal component analysis) and then performs an ordinary (un-regularized) least squares regression in this subspace. This note shows that the risk of this ordinary least squares method (PCA-OLS) is within a constant factor (namely 4) of the risk of ridge regression (RR). Keywords: risk inflation, ridge regression, pca

same-paper 2 0.9287613 112 jmlr-2013-Tapkee: An Efficient Dimension Reduction Library

Author: Sergey Lisitsyn, Christian Widmer, Fernando J. Iglesias Garcia

Abstract: We present Tapkee, a C++ template library that provides efficient implementations of more than 20 widely used dimensionality reduction techniques ranging from Locally Linear Embedding (Roweis and Saul, 2000) and Isomap (de Silva and Tenenbaum, 2002) to the recently introduced BarnesHut-SNE (van der Maaten, 2013). Our library was designed with a focus on performance and flexibility. For performance, we combine efficient multi-core algorithms, modern data structures and state-of-the-art low-level libraries. To achieve flexibility, we designed a clean interface for applying methods to user data and provide a callback API that facilitates integration with the library. The library is freely available as open-source software and is distributed under the permissive BSD 3-clause license. We encourage the integration of Tapkee into other open-source toolboxes and libraries. For example, Tapkee has been integrated into the codebase of the Shogun toolbox (Sonnenburg et al., 2010), giving us access to a rich set of kernels, distance measures and bindings to common programming languages including Python, Octave, Matlab, R, Java, C#, Ruby, Perl and Lua. Source code, examples and documentation are available at http://tapkee.lisitsyn.me. Keywords: dimensionality reduction, machine learning, C++, open source software

3 0.71468639 116 jmlr-2013-Truncated Power Method for Sparse Eigenvalue Problems

Author: Xiao-Tong Yuan, Tong Zhang

Abstract: This paper considers the sparse eigenvalue problem, which is to extract dominant (largest) sparse eigenvectors with at most k non-zero components. We propose a simple yet effective solution called truncated power method that can approximately solve the underlying nonconvex optimization problem. A strong sparse recovery result is proved for the truncated power method, and this theory is our key motivation for developing the new algorithm. The proposed method is tested on applications such as sparse principal component analysis and the densest k-subgraph problem. Extensive experiments on several synthetic and real-world data sets demonstrate the competitive empirical performance of our method. Keywords: sparse eigenvalue, power method, sparse principal component analysis, densest k-subgraph

4 0.36733878 37 jmlr-2013-Divvy: Fast and Intuitive Exploratory Data Analysis

Author: Joshua M. Lewis, Virginia R. de Sa, Laurens van der Maaten

Abstract: Divvy is an application for applying unsupervised machine learning techniques (clustering and dimensionality reduction) to the data analysis process. Divvy provides a novel UI that allows researchers to tighten the action-perception loop of changing algorithm parameters and seeing a visualization of the result. Machine learning researchers can use Divvy to publish easy to use reference implementations of their algorithms, which helps the machine learning field have a greater impact on research practices elsewhere. Keywords: clustering, dimensionality reduction, open source software, human computer interaction, data visualization

5 0.35015109 46 jmlr-2013-GURLS: A Least Squares Library for Supervised Learning

Author: Andrea Tacchetti, Pavan K. Mallapragada, Matteo Santoro, Lorenzo Rosasco

Abstract: We present GURLS, a least squares, modular, easy-to-extend software library for efficient supervised learning. GURLS is targeted to machine learning practitioners, as well as non-specialists. It offers a number of state-of-the-art training strategies for medium and large-scale learning, and routines for efficient model selection. The library is particularly well suited for multi-output problems (multi-category/multi-label). GURLS is currently available in two independent implementations: Matlab and C++. It takes advantage of the favorable properties of the regularized least squares algorithm to exploit advanced tools in linear algebra. Routines to handle computations with very large matrices by means of memory-mapped storage and distributed task execution are available. The package is distributed under the BSD license and is available for download at https://github.com/LCSL/GURLS. Keywords: regularized least squares, big data, linear algebra

6 0.31164974 59 jmlr-2013-Large-scale SVD and Manifold Learning

7 0.30669326 105 jmlr-2013-Sparsity Regret Bounds for Individual Sequences in Online Linear Regression

8 0.2931557 74 jmlr-2013-Multivariate Convex Regression with Adaptive Partitioning

9 0.27696058 31 jmlr-2013-Derivative Estimation with Local Polynomial Fitting

10 0.27480364 50 jmlr-2013-Greedy Feature Selection for Subspace Clustering

11 0.27194542 86 jmlr-2013-Parallel Vector Field Embedding

12 0.26159719 52 jmlr-2013-How to Solve Classification and Regression Problems on High-Dimensional Data with a Supervised Extension of Slow Feature Analysis

13 0.2603541 53 jmlr-2013-Improving CUR Matrix Decomposition and the Nystrom Approximation via Adaptive Sampling

14 0.25183129 102 jmlr-2013-Sparse Matrix Inversion with Scaled Lasso

15 0.24829575 5 jmlr-2013-A Near-Optimal Algorithm for Differentially-Private Principal Components

16 0.24655898 27 jmlr-2013-Consistent Selection of Tuning Parameters via Variable Selection Stability

17 0.23329198 1 jmlr-2013-AC++Template-Based Reinforcement Learning Library: Fitting the Code to the Mathematics

18 0.22866964 25 jmlr-2013-Communication-Efficient Algorithms for Statistical Optimization

19 0.22839585 19 jmlr-2013-BudgetedSVM: A Toolbox for Scalable SVM Approximations

20 0.22480139 83 jmlr-2013-Orange: Data Mining Toolbox in Python