
108 nips-2009-Heterogeneous multitask learning with joint sparsity constraints


Source: pdf

Author: Xiaolin Yang, Seyoung Kim, Eric P. Xing

Abstract: Multitask learning addresses the problem of learning related tasks that presumably share some commonalities in their input-output mapping functions. Previous approaches to multitask learning usually deal with homogeneous tasks, such as purely regression tasks or entirely classification tasks. In this paper, we consider the problem of learning multiple related tasks of predicting both continuous and discrete outputs from a common set of input variables that lie in a high-dimensional feature space. All of the tasks are related in the sense that they share the same set of relevant input variables, but the amount of influence of each input on different outputs may vary. We formulate this problem as a combination of linear regressions and logistic regressions, and model the joint sparsity as an L1/L∞ or L1/L2 norm of the model parameters. Among several possible applications, our approach addresses an important open problem in genetic association mapping, where the goal is to discover genetic markers that influence multiple correlated traits jointly. In our experiments, we demonstrate our method in this setting, using simulated and clinical asthma datasets, and we show that our method can effectively recover the relevant inputs with respect to all of the tasks.
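The formulation described in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation; it assumes a squared-error loss for continuous outputs, a logistic loss for binary outputs coded as 0/1, no intercept terms, and a single regularization weight `lam`. The function names, the 0.5 scaling of the squared error, and the layout of the weight table `B` (one row per input, one column per task) are illustrative assumptions. The penalty sums a per-input norm over the rows of `B`, which is what pushes an input to be dropped from all tasks at once.

```python
import math


def joint_sparsity_penalty(B, norm="l2"):
    """Sum over inputs of the norm of that input's weights across all tasks.

    B[j][k] is the weight of input j for task k (assumed layout).
    norm="l2" gives an L1/L2 penalty, norm="linf" gives an L1/L-infinity penalty.
    """
    total = 0.0
    for row in B:
        if norm == "l2":
            total += math.sqrt(sum(w * w for w in row))
        elif norm == "linf":
            total += max(abs(w) for w in row)
        else:
            raise ValueError("norm must be 'l2' or 'linf'")
    return total


def heterogeneous_objective(X, Y, B, task_types, lam, norm="l2"):
    """Penalized loss over mixed regression and classification tasks.

    X[i][j] is input j of sample i, Y[i][k] is output k of sample i.
    task_types[k] is "regression" or "classification" (labels in {0, 1}).
    """
    loss = 0.0
    for x, y in zip(X, Y):
        for k, kind in enumerate(task_types):
            score = sum(x[j] * B[j][k] for j in range(len(x)))
            if kind == "regression":
                # Squared-error loss (the 0.5 factor is an assumed convention).
                loss += 0.5 * (y[k] - score) ** 2
            elif kind == "classification":
                # Logistic loss log(1 + exp(score)) - y * score, computed stably.
                softplus = max(score, 0.0) + math.log1p(math.exp(-abs(score)))
                loss += softplus - y[k] * score
            else:
                raise ValueError("unknown task type: " + str(kind))
    return loss + lam * joint_sparsity_penalty(B, norm)
```

Minimizing this objective over `B` would require an optimizer, which the sketch leaves out. A row of `B` that is entirely zero contributes nothing to the penalty, corresponding to an input judged irrelevant for every task.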


reference text

[1] A. Argyriou, T. Evgeniou, and M. Pontil. Convex multi-task feature learning. Machine Learning, 73(3):243–272, 2008.

[2] B. Bakker and T. Heskes. Task clustering and gating for Bayesian multitask learning. Journal of Machine Learning Research, 4:83–99, 2003.

[3] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004.

[4] R. Caruana. Multitask learning. Machine Learning, 28:41–75, 1997.

[5] V. Emilsson, G. Thorleifsson, B. Zhang, A.S. Leonardson, F. Zink, J. Zhu, S. Carlson, A. Helgason, G.B. Walters, S. Gunnarsdottir, et al. Variations in DNA elucidate molecular networks that cause disease. Nature, 452(27):423–28, 2008.

[6] J. Friedman, T. Hastie, and R. Tibshirani. Regularization paths for generalized linear models via coordinate descent. Technical Report 703, Department of Statistics, Stanford University, 2009.

[7] S. Kim and E. P. Xing. Statistical estimation of correlated genome associations to a quantitative trait network. PLoS Genetics, 5(8):e1000587, 2009.

[8] K. Koh, S. Kim, and S. Boyd. An interior-point method for large-scale l1-regularized logistic regression. Journal of Machine Learning Research, 8(8):1519–1555, 2007.

[9] G. Obozinski, B. Taskar, and M. Jordan. Joint covariate selection for grouped classification. Technical Report 743, Department of Statistics, University of California, Berkeley, 2007.

[10] G. Obozinski, M.J. Wainwright, and M.I. Jordan. High-dimensional union support recovery in multivariate regression. In Advances in Neural Information Processing Systems 21, 2008.

[11] M. Schmidt, G. Fung, and R. Rosales. Fast optimization methods for l1 regularization: a comparative study and two new approaches. In Proceedings of the European Conference on Machine Learning, 2007.

[12] The International HapMap Consortium. A haplotype map of the human genome. Nature, 437:1299–1320, 2005.

[13] R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B, 58(1):267–288, 1996.

[14] K. Yu, V. Tresp, and A. Schwaighofer. Learning Gaussian processes from multiple tasks. In Proceedings of the 22nd International Conference on Machine Learning, 2005.

[15] M. Yuan and Y. Lin. Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society, Series B, 68(1):49–67, 2006.

[16] J. Zhang, Z. Ghahramani, and Y. Yang. Flexible latent variable models for multi-task learning. Machine Learning, 73(3):221–242, 2008.

[17] P. Zhao, G. Rocha, and B. Yu. Grouped and hierarchical model selection through composite absolute penalties. Technical Report 703, Department of Statistics, University of California, Berkeley, 2008.

[18] J. Zhu, B. Zhang, E.N. Smith, B. Drees, R.B. Brem, L. Kruglyak, R.E. Bumgarner, and E.E. Schadt. Integrating large-scale functional genomic data to dissect the complexity of yeast regulatory networks. Nature Genetics, 40:854–61, 2008.