jmlr jmlr2012 jmlr2012-113 knowledge-graph by maker-knowledge-mining

113 jmlr-2012-The huge Package for High-dimensional Undirected Graph Estimation in R


Source: pdf

Author: Tuo Zhao, Han Liu, Kathryn Roeder, John Lafferty, Larry Wasserman

Abstract: We describe an R package named huge which provides easy-to-use functions for estimating high dimensional undirected graphs from data. This package implements recent results in the literature, including Friedman et al. (2007), Liu et al. (2009, 2012) and Liu et al. (2010). Compared with the existing graph estimation package glasso, the huge package provides extra features: (1) instead of using Fortran, it is written in C, which makes the code more portable and easier to modify; (2) besides fitting Gaussian graphical models, it also provides functions for fitting high dimensional semiparametric Gaussian copula models; (3) it offers more functions, such as data-dependent model selection, data generation and graph visualization; (4) a minor convergence problem of the graphical lasso algorithm is corrected; (5) the package allows the user to apply both lossless and lossy screening rules to scale up large-scale problems, making a tradeoff between computational and statistical efficiency.

Keywords: high-dimensional undirected graph estimation, glasso, huge, semiparametric graph estimation, data-dependent model selection, lossless screening, lossy screening

1. Overview

Undirected graphs are a natural approach to describing the conditional independence structure among many variables. Each node of the graph represents a single variable, and the absence of an edge between two variables implies that they are conditionally independent given all other variables. In the past decade, significant progress has been made on designing efficient algorithms to learn undirected graphs from high-dimensional observational data sets. Most of these methods are based on either penalized maximum-likelihood estimation (Friedman et al., 2007) or penalized regression methods (Meinshausen and Bühlmann, 2006). Existing packages include glasso, Covpath and CLIME. In particular, the glasso package has been widely adopted by statisticians and computer scientists due to its friendly user interface and efficiency. In this paper (which is only a summary of the package huge; for more details please refer to the online vignette) we describe a newly developed R package named huge (High-dimensional Undirected Graph Estimation) coded in C. The package includes a wide range of functional modules and addresses some drawbacks of the graphical lasso algorithm. To gain more scalability, the package supports two modes of screening, lossless (Witten et al., 2011) and lossy. When using lossy screening, the user can select the desired screening level to scale up to high-dimensional problems, but this introduces some estimation bias.

2. Software Design and Implementation

The package huge aims to provide a general framework for high-dimensional undirected graph estimation. The package includes six functional modules (M1-M6) that facilitate a flexible analysis pipeline (Figure 1).

M1. Data Generator: The function huge.generator() can simulate multivariate Gaussian data with different undirected graphs, including hub, cluster, band, scale-free, and Erdős-Rényi random graphs. The sparsity level of the obtained graph and the signal-to-noise ratio can also be set by users.

M2. Semiparametric Transformation: The function huge.npn() implements the nonparanormal method (Liu et al., 2009, 2012) for estimating a semiparametric Gaussian copula model. The nonparanormal family extends the Gaussian distribution by marginally transforming the variables. Computationally, the nonparanormal transformation requires only one pass through the data matrix.
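To make M1 and M2 concrete, here is a minimal R sketch of how these two modules can be combined; the particular arguments (n = 200, d = 100, graph = "hub") are illustrative choices, not settings taken from the paper:

> library(huge)
> # M1: simulate 200 Gaussian samples in 100 dimensions whose conditional
> # independence structure is a hub graph; sim$data holds the n x d data
> # matrix and sim$theta the true adjacency matrix.
> sim = huge.generator(n = 200, d = 100, graph = "hub")
> # M2: apply the nonparanormal transformation (one pass over the data).
> dat.npn = huge.npn(sim$data)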
M3. Graph Screening: The scr argument in the main function huge() controls the use of large-scale correlation screening before graph estimation. The function supports both the lossless screening (Witten et al., 2011) and the lossy screening. Such screening procedures can greatly reduce the computational cost and achieve equal or even better estimation by reducing the variance at the expense of increased bias.

Figure 1: The graph estimation pipeline.

M4. Graph Estimation: Similar to the glasso package, the method argument in the huge() function supports two estimation methods: (i) the neighborhood pursuit algorithm (Meinshausen and Bühlmann, 2006) and (ii) the graphical lasso algorithm (Friedman et al., 2007). We apply coordinate descent with active sets and covariance updates, as well as other tricks suggested in Friedman et al. (2010). We modified the warm start trick to address the potential divergence problem of the graphical lasso algorithm (Mazumder and Hastie, 2011). The code is also memory-optimized using a sparse matrix data structure when estimating and storing full regularization paths for large data sets. We also provide a complementary graph estimation method based on thresholding the sample correlation matrix, which is computationally efficient and widely applied in biomedical research.

M5. Model Selection: The function huge.select() provides two regularization parameter selection methods: the stability approach for regularization selection (StARS) (Liu et al., 2010) and the rotation information criterion (RIC). We also provide a likelihood-based extended Bayesian information criterion.

M6. Graph Visualization: The plotting functions huge.plot() and plot() provide visualizations of the simulated data sets, estimated graphs and paths. The implementation is based on the igraph package.

3. User Interface by Example

We illustrate the user interface by analyzing stock market data that we contribute to the huge package. We acquired closing prices of all stocks in the S&P 500 for all the days that the market was open between Jan 1, 2003 and Jan 1, 2008. This gave us 1258 samples for the 452 stocks that remained in the S&P 500 during the entire time period.

> library(huge)
> data(stockdata)  # Load the data
> x = log(stockdata$data[2:1258,]/stockdata$data[1:1257,])  # Preprocessing
> x.npn = huge.npn(x, npn.func=
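The snippet above is truncated mid-call. A hedged sketch of how this example could continue, based on the huge package's documented interface; the npn.func value "truncation", the use of 40 regularization parameters, and the StARS criterion are illustrative assumptions rather than the paper's confirmed settings:

> # Assumed completion of the truncated nonparanormal call (M2).
> x.npn = huge.npn(x, npn.func = "truncation")
> # M4: fit a path of graphs with the graphical lasso over 40
> # regularization parameters.
> out.npn = huge(x.npn, method = "glasso", nlambda = 40)
> # M5: choose the regularization parameter with StARS.
> sel = huge.select(out.npn, criterion = "stars")
> # M6: visualize the selected graph.
> plot(sel)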

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Journal of Machine Learning Research 13 (2012) 1059-1062 Submitted 8/11; Revised 2/12; Published 4/12 The huge Package for High-dimensional Undirected Graph Estimation in R Tuo Zhao Han Liu∗ TOURZHAO@JHU. [sent-1, score-0.323]

2 EDU Department of Statistics Carnegie Mellon University Pittsburgh, PA, 15213 Editor: Mikio Braun Abstract We describe an R package named huge which provides easy-to-use functions for estimating high dimensional undirected graphs from data. [sent-10, score-0.897]

3 This package implements recent results in the literature, including Friedman et al. [sent-11, score-0.283]

4 Keywords: high-dimensional undirected graph estimation, glasso, huge, semiparametric graph estimation, data-dependent model selection, lossless screening, lossy screening 1. [sent-16, score-1.49]

5 Overview Undirected graphs are a natural approach to describe the conditional independence among many variables. [sent-17, score-0.067]

6 Each node of the graph represents a single variable and no edge between two variables implies that they are conditionally independent given all other variables. [sent-18, score-0.147]

7 In the past decade, significant progress has been made on designing efficient algorithms to learn undirected graphs from high-dimensional observational data sets. [sent-19, score-0.261]

8 Most of these methods are based on either the penalized maximum-likelihood estimation (Friedman et al. [sent-20, score-0.077]

9 , 2007) or penalized regression methods (Meinshausen and Bühlmann, 2006). [sent-21, score-0.032]

10 In particular, the glasso package has been widely adopted by statisticians and computer scientists due to its friendly user interface and efficiency. [sent-28, score-0.56]

11 In this paper we describe a newly developed R package named huge (High-dimensional Undirected Graph Estimation) coded in C. [sent-29, score-0.648]

12 The package includes a wide range of functional modules and addresses some drawbacks of the graphical lasso algorithm. [sent-30, score-0.649]

13 To gain more scalability, the package supports two modes of screening, lossless (Witten et al., 2011) and lossy screening. [sent-31, score-0.522]

14 When using lossy screening, the user can select the desired screening level to scale up for high-dimensional problems, but this introduces some estimation bias. [sent-33, score-0.714]

15 Software Design and Implementation The package huge aims to provide a general framework for high-dimensional undirected graph estimation. [sent-35, score-0.882]

16 The package includes six functional modules (M1-M6) that facilitate a flexible pipeline for analysis (Figure 1). [sent-36, score-0.34]

17 The function huge.generator() can simulate multivariate Gaussian data with different undirected graphs, including hub, cluster, band, scale-free, and Erdős-Rényi random graphs. [sent-39, score-0.27]

18 The sparsity level of the obtained graph and signal-to-noise ratio can also be set up by users. [sent-40, score-0.147]

19 (Liu et al., 2009, 2012) for estimating a semiparametric Gaussian copula model. [sent-44, score-0.283]

20 The nonparanormal family extends the Gaussian distribution by marginally transforming the variables. [sent-45, score-0.234]

21 Computationally, the nonparanormal transformation only requires one pass through the data matrix. [sent-46, score-0.269]

22 Graph Screening: The scr argument in the main function huge() controls the use of large-scale correlation screening before graph estimation. [sent-48, score-0.535]

23 The function supports the lossless screening (Witten et al., 2011) and the lossy screening. [sent-49, score-0.636]

24 Such screening procedures can greatly reduce the computational cost and achieve equal or even better estimation by reducing the variance at the expense of increased bias. [sent-51, score-0.439]

25 Graph Estimation: Similar to the glasso package, the method argument in the huge() function supports two estimation methods: (i) the neighborhood pursuit algorithm (Meinshausen and Bühlmann, 2006) and (ii) the graphical lasso algorithm (Friedman et al., 2007). [sent-54, score-0.843]

26 We apply coordinate descent with active sets and covariance updates, as well as other tricks suggested in Friedman et al. (2010). [sent-56, score-0.053]

27 We modified the warm start trick to address the potential divergence problem of the graphical lasso algorithm (Mazumder and Hastie, 2011). [sent-58, score-0.34]

28 The code is also memory-optimized using the sparse matrix data structure when estimating and storing full regularization paths for large data sets. [sent-59, score-0.113]

29 This paper is only a summary of the package huge. [sent-60, score-0.248]

30 We also provide a complementary graph estimation method based on thresholding the sample correlation matrix, which is computationally efficient and widely applied in biomedical research. [sent-63, score-0.225]

31 The function huge.select() provides two regularization parameter selection methods: the stability approach for regularization selection (StARS) (Liu et al., 2010). [sent-66, score-0.124]

32 The plotting functions huge.plot() and plot() provide visualizations of the simulated data sets, estimated graphs and paths. [sent-71, score-0.093]

33 User Interface by Example We illustrate the user interface by analyzing stock market data that we contribute to the huge package. [sent-74, score-0.474]

34 We acquired closing prices from all stocks in the S&P 500 for all the days that the market was open between Jan 1, 2003 and Jan 1, 2008. [sent-75, score-0.203]

35 This gave us 1258 samples for the 452 stocks that remained in the S&P 500 during the entire time period. [sent-76, score-0.067]

36 Here the data have been transformed by calculating the log-ratio of the price at time t to the price at time t − 1. [sent-85, score-0.058]

37 The nonparanormal transformation is applied to the data, and a graph is estimated using the graphical lasso (the default is the Meinshausen-Bühlmann estimator). [sent-86, score-0.784]

38 The program automatically sets up a sequence of 40 regularization parameters and estimates the graph path. [sent-87, score-0.195]

39 Performance Benchmark To compare huge with glasso (ver 1. [sent-90, score-0.572]

40 We simulate the data from a normal distribution with the Erdős-Rényi random graph structure (sparsity 1%). [sent-92, score-0.253]

41 Timings (in seconds) are computed over 10 values of the corresponding regularization parameter, and the range of regularization parameters is chosen so that each method produced approximately the same number of non-zero estimates. [sent-93, score-0.096]

42 The convergence threshold of both glasso and huge is chosen to be 10^-4. [sent-94, score-0.572]

43 2) were unable to obtain timing results due to their numerical instability. [sent-97, score-0.026]

44 For the neighborhood pursuit, we can see that huge achieves the best performance. [sent-98, score-0.359]

45 In particular, when the lossy screening rule is applied, huge automatically reduces each individual lasso problem from the original dimension d to the sample size n, so better efficiency can be achieved when d is much larger than n. [sent-99, score-1.116]

46 Based on our experiments, the speedup due to the lossy screening rule can exceed 500%. [sent-100, score-0.632]

47 Table 1 (timings in seconds, d = 1000, n = 100) compares the methods huge-neighborhood pursuit (lossy), huge-neighborhood pursuit, glasso-neighborhood pursuit, huge-graphical lasso (lossy), huge-graphical lasso (lossless), and glasso-graphical lasso. [sent-101, score-0.981]

48 Table 1: Experimental Results. Unlike the neighborhood pursuit, the graphical lasso estimates the inverse covariance matrix. [sent-142, score-0.37]

49 The lossless screening rule (Witten et al., 2011) greatly reduces the computation required by the graphical lasso algorithm and gains an extra performance boost. [sent-144, score-0.381]

50 Summary and Acknowledgement We developed a new package named huge for high dimensional undirected graph estimation. [sent-146, score-0.977]

51 The package is complementary to the existing glasso package by providing extra features and functional modules. [sent-147, score-0.847]

52 We plan to maintain and support this package in the future. [sent-148, score-0.248]

53 Han Liu, John Lafferty, and Larry Wasserman are supported by NSF grant IIS-1116730 and AFOSR contract FA9550-09-1-0373. [sent-150, score-0.024]

54 Regularization paths for generalized linear models via coordinate descent. [sent-162, score-0.039]

55 The nonparanormal: semiparametric estimation of high dimensional undirected graphs. [sent-168, score-0.653]

56 Stability approach to regularization selection for high dimensional graphical models. [sent-174, score-0.238]

57 High dimensional graphs and variable selection with the lasso. [sent-192, score-0.111]

58 New insights and faster computations for the graphical lasso. [sent-197, score-0.175]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('screening', 0.362), ('huge', 0.323), ('lossy', 0.27), ('glasso', 0.249), ('package', 0.248), ('lossless', 0.234), ('nonparanormal', 0.234), ('roeder', 0.21), ('pursuit', 0.166), ('semiparametric', 0.166), ('undirected', 0.164), ('lasso', 0.161), ('graph', 0.147), ('graphical', 0.146), ('liu', 0.127), ('copula', 0.117), ('witten', 0.111), ('lafferty', 0.105), ('ver', 0.1), ('wasserman', 0.092), ('friedman', 0.091), ('afferty', 0.078), ('covpath', 0.078), ('kathryn', 0.078), ('stockdata', 0.078), ('tuo', 0.078), ('han', 0.069), ('graphs', 0.067), ('stocks', 0.067), ('jhu', 0.067), ('nyi', 0.067), ('larry', 0.066), ('meinshausen', 0.063), ('cmu', 0.061), ('hlmann', 0.061), ('erd', 0.06), ('mazumder', 0.055), ('iu', 0.052), ('market', 0.052), ('hao', 0.052), ('named', 0.051), ('regularization', 0.048), ('zhao', 0.046), ('estimation', 0.045), ('dimensional', 0.044), ('extra', 0.042), ('johns', 0.042), ('hopkins', 0.04), ('hastie', 0.04), ('supports', 0.04), ('paths', 0.039), ('jan', 0.039), ('simulate', 0.039), ('modules', 0.039), ('user', 0.037), ('neighborhood', 0.036), ('transformation', 0.035), ('visualization', 0.035), ('implements', 0.035), ('warm', 0.033), ('corrected', 0.033), ('ric', 0.033), ('friendly', 0.033), ('timings', 0.033), ('raph', 0.033), ('hub', 0.033), ('stock', 0.033), ('complementary', 0.033), ('greatly', 0.032), ('penalized', 0.032), ('significant', 0.03), ('prices', 0.03), ('scientists', 0.03), ('summer', 0.03), ('insights', 0.029), ('interface', 0.029), ('price', 0.029), ('stability', 0.028), ('closing', 0.028), ('portable', 0.028), ('noah', 0.028), ('drawbacks', 0.028), ('band', 0.028), ('stars', 0.028), ('functional', 0.027), ('covariance', 0.027), ('code', 0.026), ('pipeline', 0.026), ('visualizations', 0.026), ('largescale', 0.026), ('timing', 0.026), ('tricks', 0.026), ('igh', 0.026), ('days', 0.026), ('coded', 0.026), ('load', 0.026), ('edu', 0.025), ('plotting', 0.024), ('baltimore', 0.024), ('contract', 0.024)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0 113 jmlr-2012-The huge Package for High-dimensional Undirected Graph Estimation in R


2 0.24144302 40 jmlr-2012-Exact Covariance Thresholding into Connected Components for Large-Scale Graphical Lasso

Author: Rahul Mazumder, Trevor Hastie

Abstract: We consider the sparse inverse covariance regularization problem or graphical lasso with regularization parameter λ. Suppose the sample covariance graph formed by thresholding the entries of the sample covariance matrix at λ is decomposed into connected components. We show that the vertex-partition induced by the connected components of the thresholded sample covariance graph (at λ) is exactly equal to that induced by the connected components of the estimated concentration graph, obtained by solving the graphical lasso problem for the same λ. This characterizes a very interesting property of a path of graphical lasso solutions. Furthermore, this simple rule, when used as a wrapper around existing algorithms for the graphical lasso, leads to enormous performance gains. For a range of values of λ, our proposal splits a large graphical lasso problem into smaller tractable problems, making it possible to solve an otherwise infeasible large-scale problem. We illustrate the graceful scalability of our proposal via synthetic and real-life microarray examples. Keywords: sparse inverse covariance selection, sparsity, graphical lasso, Gaussian graphical models, graph connected components, concentration graph, large scale covariance estimation

3 0.12299568 48 jmlr-2012-High-Dimensional Gaussian Graphical Model Selection: Walk Summability and Local Separation Criterion

Author: Animashree Anandkumar, Vincent Y.F. Tan, Furong Huang, Alan S. Willsky

Abstract: We consider the problem of high-dimensional Gaussian graphical model selection. We identify a set of graphs for which an efficient estimation algorithm exists, and this algorithm is based on thresholding of empirical conditional covariances. Under a set of transparent conditions, we establish structural consistency (or sparsistency) for the proposed algorithm, when the number of samples n = Ω(Jmin^-2 log p), where p is the number of variables and Jmin is the minimum (absolute) edge potential of the graphical model. The sufficient conditions for sparsistency are based on the notion of walk-summability of the model and the presence of sparse local vertex separators in the underlying graph. We also derive novel non-asymptotic necessary conditions on the number of samples required for sparsistency. Keywords: Gaussian graphical model selection, high-dimensional learning, local-separation property, walk-summability, necessary conditions for model selection

4 0.087644152 39 jmlr-2012-Estimation and Selection via Absolute Penalized Convex Minimization And Its Multistage Adaptive Applications

Author: Jian Huang, Cun-Hui Zhang

Abstract: The ℓ1 -penalized method, or the Lasso, has emerged as an important tool for the analysis of large data sets. Many important results have been obtained for the Lasso in linear regression which have led to a deeper understanding of high-dimensional statistical problems. In this article, we consider a class of weighted ℓ1 -penalized estimators for convex loss functions of a general form, including the generalized linear models. We study the estimation, prediction, selection and sparsity properties of the weighted ℓ1 -penalized estimator in sparse, high-dimensional settings where the number of predictors p can be much larger than the sample size n. Adaptive Lasso is considered as a special case. A multistage method is developed to approximate concave regularized estimation by applying an adaptive Lasso recursively. We provide prediction and estimation oracle inequalities for single- and multi-stage estimators, a general selection consistency theorem, and an upper bound for the dimension of the Lasso estimator. Important models including the linear regression, logistic regression and log-linear models are used throughout to illustrate the applications of the general results. Keywords: variable selection, penalized estimation, oracle inequality, generalized linear models, selection consistency, sparsity

5 0.085502625 2 jmlr-2012-A Comparison of the Lasso and Marginal Regression

Author: Christopher R. Genovese, Jiashun Jin, Larry Wasserman, Zhigang Yao

Abstract: The lasso is an important method for sparse, high-dimensional regression problems, with efficient algorithms available, a long history of practical success, and a large body of theoretical results supporting and explaining its performance. But even with the best available algorithms, finding the lasso solutions remains a computationally challenging task in cases where the number of covariates vastly exceeds the number of data points. Marginal regression, where each dependent variable is regressed separately on each covariate, offers a promising alternative in this case because the estimates can be computed roughly two orders faster than the lasso solutions. The question that remains is how the statistical performance of the method compares to that of the lasso in these cases. In this paper, we study the relative statistical performance of the lasso and marginal regression for sparse, high-dimensional regression problems. We consider the problem of learning which coefficients are non-zero. Our main results are as follows: (i) we compare the conditions under which the lasso and marginal regression guarantee exact recovery in the fixed design, noise free case; (ii) we establish conditions under which marginal regression provides exact recovery with high probability in the fixed design, noise free, random coefficients case; and (iii) we derive rates of convergence for both procedures, where performance is measured by the number of coefficients with incorrect sign, and characterize the regions in the parameter space recovery is and is not possible under this metric. In light of the computational advantages of marginal regression in very high dimensional problems, our theoretical and simulations results suggest that the procedure merits further study. Keywords: high-dimensional regression, lasso, phase diagram, regularization

6 0.050777029 7 jmlr-2012-A Multi-Stage Framework for Dantzig Selector and LASSO

7 0.044163816 30 jmlr-2012-DARWIN: A Framework for Machine Learning and Computer Vision Research and Development

8 0.042626947 19 jmlr-2012-An Introduction to Artificial Prediction Markets for Classification

9 0.042219341 117 jmlr-2012-Variable Selection in High-dimensional Varying-coefficient Models with Global Optimality

10 0.041711342 25 jmlr-2012-Characterization and Greedy Learning of Interventional Markov Equivalence Classes of Directed Acyclic Graphs

11 0.03557292 11 jmlr-2012-A Unifying Probabilistic Perspective for Spectral Dimensionality Reduction: Insights and New Models

12 0.032323606 90 jmlr-2012-Pattern for Python

13 0.029817058 35 jmlr-2012-EP-GIG Priors and Applications in Bayesian Sparse Learning

14 0.02834532 111 jmlr-2012-Structured Sparsity and Generalization

15 0.027130324 61 jmlr-2012-ML-Flex: A Flexible Toolbox for Performing Classification Analyses In Parallel

16 0.025358563 119 jmlr-2012-glm-ie: Generalised Linear Models Inference & Estimation Toolbox

17 0.025253458 47 jmlr-2012-GPLP: A Local and Parallel Computation Toolbox for Gaussian Process Regression

18 0.024766259 68 jmlr-2012-Minimax Manifold Estimation

19 0.024643587 26 jmlr-2012-Coherence Functions with Applications in Large-Margin Classification Methods

20 0.022989208 114 jmlr-2012-Towards Integrative Causal Analysis of Heterogeneous Data Sets and Studies


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.123), (1, 0.145), (2, 0.029), (3, -0.135), (4, 0.026), (5, 0.389), (6, -0.073), (7, -0.14), (8, -0.024), (9, -0.047), (10, 0.152), (11, 0.034), (12, -0.107), (13, -0.322), (14, -0.196), (15, -0.151), (16, 0.024), (17, 0.021), (18, 0.04), (19, 0.052), (20, 0.061), (21, 0.032), (22, 0.007), (23, 0.118), (24, -0.03), (25, 0.075), (26, 0.022), (27, 0.107), (28, 0.03), (29, 0.007), (30, -0.004), (31, -0.054), (32, -0.013), (33, -0.086), (34, -0.062), (35, -0.008), (36, -0.021), (37, 0.041), (38, 0.012), (39, -0.078), (40, 0.059), (41, 0.028), (42, 0.066), (43, -0.031), (44, 0.034), (45, -0.014), (46, 0.033), (47, 0.095), (48, -0.029), (49, 0.029)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.98398417 113 jmlr-2012-The huge Package for High-dimensional Undirected Graph Estimation in R


2 0.88474327 40 jmlr-2012-Exact Covariance Thresholding into Connected Components for Large-Scale Graphical Lasso

3 0.58463293 48 jmlr-2012-High-Dimensional Gaussian Graphical Model Selection: Walk Summability and Local Separation Criterion

4 0.31787392 2 jmlr-2012-A Comparison of the Lasso and Marginal Regression

5 0.30733198 39 jmlr-2012-Estimation and Selection via Absolute Penalized Convex Minimization And Its Multistage Adaptive Applications

6 0.29900345 25 jmlr-2012-Characterization and Greedy Learning of Interventional Markov Equivalence Classes of Directed Acyclic Graphs

7 0.29649401 30 jmlr-2012-DARWIN: A Framework for Machine Learning and Computer Vision Research and Development

8 0.21794404 11 jmlr-2012-A Unifying Probabilistic Perspective for Spectral Dimensionality Reduction: Insights and New Models

9 0.20555042 19 jmlr-2012-An Introduction to Artificial Prediction Markets for Classification

10 0.1808826 117 jmlr-2012-Variable Selection in High-dimensional Varying-coefficient Models with Global Optimality

11 0.17684102 47 jmlr-2012-GPLP: A Local and Parallel Computation Toolbox for Gaussian Process Regression

12 0.15982656 7 jmlr-2012-A Multi-Stage Framework for Dantzig Selector and LASSO

13 0.1555783 114 jmlr-2012-Towards Integrative Causal Analysis of Heterogeneous Data Sets and Studies

14 0.14580406 109 jmlr-2012-Stability of Density-Based Clustering

15 0.14433117 90 jmlr-2012-Pattern for Python

16 0.13031815 5 jmlr-2012-A Local Spectral Method for Graphs: With Applications to Improving Graph Partitions and Exploring Data Graphs Locally

17 0.12434412 72 jmlr-2012-Multi-Target Regression with Rule Ensembles

18 0.11913892 61 jmlr-2012-ML-Flex: A Flexible Toolbox for Performing Classification Analyses In Parallel

19 0.10910739 35 jmlr-2012-EP-GIG Priors and Applications in Bayesian Sparse Learning

20 0.10759182 101 jmlr-2012-SVDFeature: A Toolkit for Feature-based Collaborative Filtering


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(7, 0.016), (21, 0.049), (26, 0.043), (27, 0.011), (29, 0.018), (49, 0.017), (56, 0.017), (57, 0.544), (69, 0.012), (75, 0.026), (79, 0.025), (92, 0.029), (96, 0.091)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.86489671 113 jmlr-2012-The huge Package for High-dimensional Undirected Graph Estimation in R


2 0.81066316 41 jmlr-2012-Exploration in Relational Domains for Model-based Reinforcement Learning

Author: Tobias Lang, Marc Toussaint, Kristian Kersting

Abstract: A fundamental problem in reinforcement learning is balancing exploration and exploitation. We address this problem in the context of model-based reinforcement learning in large stochastic relational domains by developing relational extensions of the concepts of the E 3 and R- MAX algorithms. Efficient exploration in exponentially large state spaces needs to exploit the generalization of the learned model: what in a propositional setting would be considered a novel situation and worth exploration may in the relational setting be a well-known context in which exploitation is promising. To address this we introduce relational count functions which generalize the classical notion of state and action visitation counts. We provide guarantees on the exploration efficiency of our framework using count functions under the assumption that we had a relational KWIK learner and a near-optimal planner. We propose a concrete exploration algorithm which integrates a practically efficient probabilistic rule learner and a relational planner (for which there are no guarantees, however) and employs the contexts of learned relational rules as features to model the novelty of states and actions. Our results in noisy 3D simulated robot manipulation problems and in domains of the international planning competition demonstrate that our approach is more effective than existing propositional and factored exploration techniques. Keywords: reinforcement learning, statistical relational learning, exploration, relational transition models, robotics

3 0.70269763 70 jmlr-2012-Multi-Assignment Clustering for Boolean Data

Author: Mario Frank, Andreas P. Streich, David Basin, Joachim M. Buhmann

Abstract: We propose a probabilistic model for clustering Boolean data where an object can be simultaneously assigned to multiple clusters. By explicitly modeling the underlying generative process that combines the individual source emissions, highly structured data are expressed with substantially fewer clusters compared to single-assignment clustering. As a consequence, such a model provides robust parameter estimators even when the number of samples is low. We extend the model with different noise processes and demonstrate that maximum-likelihood estimation with multiple assignments consistently infers source parameters more accurately than single-assignment clustering. Our model is primarily motivated by the task of role mining for role-based access control, where users of a system are assigned one or more roles. In experiments with real-world access-control data, our model exhibits better generalization performance than state-of-the-art approaches. Keywords: clustering, multi-assignments, overlapping clusters, Boolean data, role mining, latent feature models

4 0.301842 72 jmlr-2012-Multi-Target Regression with Rule Ensembles

Author: Timo Aho, Bernard Ženko, Sašo Džeroski, Tapio Elomaa

Abstract: Methods for learning decision rules are being successfully applied to many problem domains, in particular when understanding and interpretation of the learned model is necessary. In many real life problems, we would like to predict multiple related (nominal or numeric) target attributes simultaneously. While several methods for learning rules that predict multiple targets at once exist, they are all based on the covering algorithm, which does not work well for regression problems. A better solution for regression is the rule ensemble approach that transcribes an ensemble of decision trees into a large collection of rules. An optimization procedure is then used to select the best (and much smaller) subset of these rules and to determine their respective weights. We introduce the FIRE algorithm for solving multi-target regression problems, which employs the rule ensembles approach. We improve the accuracy of the algorithm by adding simple linear functions to the ensemble. We also extensively evaluate the algorithm with and without linear functions. The results show that the accuracy of multi-target regression rule ensembles is high. They are more accurate than, for instance, multi-target regression trees, but not quite as accurate as multi-target random forests. The rule ensembles are significantly more concise than random forests, and it is also possible to create compact rule sets that are smaller than a single regression tree but still comparable in accuracy. Keywords: multi-target prediction, rule learning, rule ensembles, regression

5 0.28254983 116 jmlr-2012-Transfer in Reinforcement Learning via Shared Features

Author: George Konidaris, Ilya Scheidwasser, Andrew Barto

Abstract: We present a framework for transfer in reinforcement learning based on the idea that related tasks share some common features, and that transfer can be achieved via those shared features. The framework attempts to capture the notion of tasks that are related but distinct, and provides some insight into when transfer can be usefully applied to a problem sequence and when it cannot. We apply the framework to the knowledge transfer problem, and show that an agent can learn a portable shaping function from experience in a sequence of tasks to significantly improve performance in a later related task, even given a very brief training period. We also apply the framework to skill transfer, to show that agents can learn portable skills across a sequence of tasks that significantly improve performance on later related tasks, approaching the performance of agents given perfectly learned problem-specific skills. Keywords: reinforcement learning, transfer, shaping, skills

6 0.24715443 58 jmlr-2012-Linear Fitted-Q Iteration with Multiple Reward Functions

7 0.24533802 9 jmlr-2012-A Topic Modeling Toolbox Using Belief Propagation

8 0.23956338 42 jmlr-2012-Facilitating Score and Causal Inference Trees for Large Observational Studies

9 0.23856029 65 jmlr-2012-MedLDA: Maximum Margin Supervised Topic Models

10 0.23730242 86 jmlr-2012-Optimistic Bayesian Sampling in Contextual-Bandit Problems

11 0.23350808 11 jmlr-2012-A Unifying Probabilistic Perspective for Spectral Dimensionality Reduction: Insights and New Models

12 0.22931793 57 jmlr-2012-Learning Symbolic Representations of Hybrid Dynamical Systems

13 0.22720867 85 jmlr-2012-Optimal Distributed Online Prediction Using Mini-Batches

14 0.22022396 114 jmlr-2012-Towards Integrative Causal Analysis of Heterogeneous Data Sets and Studies

15 0.21934634 75 jmlr-2012-NIMFA : A Python Library for Nonnegative Matrix Factorization

16 0.2193293 117 jmlr-2012-Variable Selection in High-dimensional Varying-coefficient Models with Global Optimality

17 0.21920627 27 jmlr-2012-Conditional Likelihood Maximisation: A Unifying Framework for Information Theoretic Feature Selection

18 0.21915111 40 jmlr-2012-Exact Covariance Thresholding into Connected Components for Large-Scale Graphical Lasso

19 0.21860838 80 jmlr-2012-On Ranking and Generalization Bounds

20 0.21741302 93 jmlr-2012-Quantum Set Intersection and its Application to Associative Memory