
26 nips-2010-Adaptive Multi-Task Lasso: with Application to eQTL Detection

Source: pdf

Author: Seunghak Lee, Jun Zhu, Eric P. Xing

Abstract: To understand the relationship between genomic variations among populations and complex diseases, it is essential to detect eQTLs that are associated with phenotypic effects. However, detecting eQTLs remains a challenge because of the complex underlying mechanisms and the very large number of genetic loci involved compared to the number of samples. To address this problem, it is desirable to take advantage of the structure of the data and of prior information about genomic locations, such as conservation scores and transcription factor binding sites. In this paper, we propose a novel regularized regression approach for detecting eQTLs that takes related traits into account simultaneously while incorporating many regulatory features. We first present a Bayesian network for a multi-task learning problem that includes priors on SNPs, making it possible to estimate the significance of each covariate adaptively. We then find the maximum a posteriori (MAP) estimate of the regression coefficients and estimate the weights of the covariates jointly. This optimization procedure is efficient because it can be carried out by alternating between projected gradient descent and coordinate descent. Experimental results on simulated and real yeast datasets confirm that our model outperforms previous methods for finding eQTLs.
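To make the coordinate descent step mentioned in the abstract more concrete, here is a minimal sketch of coordinate descent for a single-task lasso with per-covariate penalty weights. It is an illustration under stated assumptions, not the authors' algorithm: the function names (`soft_threshold`, `weighted_lasso_cd`), the objective scaling, and the toy data are all hypothetical, the weights are fixed inputs rather than learned from regulatory features, and the projected gradient step that the paper uses to update the weights is omitted.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def weighted_lasso_cd(X, y, weights, lam, n_sweeps=200):
    """Minimize 0.5 * ||y - X b||^2 + lam * sum_j weights[j] * |b_j|.

    X is a list of rows, y a list of responses, and weights a list of
    per-covariate penalty scales. The weights are hypothetical stand-ins
    for adaptive weights: here they are given, not estimated.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X beta, since beta starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of column j with the residual that excludes j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j])
                      for i in range(n))
            new_bj = soft_threshold(rho, lam * weights[j]) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                # Keep the residual in sync with the updated coefficient.
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


if __name__ == "__main__":
    # Toy design with two orthogonal covariates (made-up numbers).
    X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
    y = [2.0, 0.3, 2.0, 0.3]
    # A larger weight on the second covariate penalizes it more heavily.
    print(weighted_lasso_cd(X, y, weights=[0.5, 1.0], lam=1.0))
    # A smaller weight lets the same covariate stay in the model.
    print(weighted_lasso_cd(X, y, weights=[0.5, 0.1], lam=1.0))
```

In this sketch a covariate with a small weight is penalized less and is therefore more likely to keep a nonzero coefficient, which is the general idea behind adaptive penalties. How the weights are tied to prior information and shared across tasks is specific to the paper and is not reproduced here.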

reference text

[1] R. Sladek, G. Rocheleau, J. Rung, C. Dina, L. Shen, D. Serre, P. Boutin, D. Vincent, A. Belisle, S. Hadjadj, et al. A genome-wide association study identifies novel risk loci for type 2 diabetes. Nature, 445(7130):881–885, 2007.

[2] R. Tibshirani. Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1):267–288, 1996.

[3] S.I. Lee, A.M. Dudley, D. Drubin, P.A. Silver, N.J. Krogan, D. Pe’er, and D. Koller. Learning a prior on regulatory potential from eQTL data. PLoS Genetics, 5(1):e1000358, 2009.

[4] S. Kim and E. P. Xing. Statistical estimation of correlated genome associations to a quantitative trait network. PLoS Genetics, 5(8):e1000587, 2009.

[5] G. Obozinski, B. Taskar, and M. Jordan. Multi-task feature selection. Technical Report, Department of Statistics, University of California, Berkeley, 2006.

[6] M. Szafranski, Y. Grandvalet, and P. Morizet-Mahoudeaux. Hierarchical penalization. Advances in Neural Information Processing Systems, 20:1457–1464, 2007.

[7] H. Zou. The adaptive Lasso and its oracle properties. Journal of the American Statistical Association, 101(476):1418–1429, 2006.

[8] S.I. Lee, V. Chatalbashev, D. Vickrey, and D. Koller. Learning a meta-level prior for feature relevance from multiple related tasks. In Proceedings of the 24th International Conference on Machine Learning, pages 489–496, 2007.

[9] T. Park and G. Casella. The Bayesian Lasso. Journal of the American Statistical Association, 103(482):681–686, 2008.

[10] B. M. Marlin, M. Schmidt, and K. P. Murphy. Group sparse priors for covariance estimation. In Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence, pages 383–392, 2009.

[11] E. Gómez, M. A. Gómez-Villegas, and J. M. Marín. A multivariate generalization of the power exponential family of distributions. Communications in Statistics-Theory and Methods, 27(3):589–600, 1998.

[12] H. Lee, A. Battle, R. Raina, and A. Y. Ng. Efficient sparse coding algorithms. Advances in Neural Information Processing Systems, 19:801–808, 2007.

[13] J. Duchi, S. Shalev-Shwartz, Y. Singer, and T. Chandra. Efficient projections onto the ℓ1 ball for learning in high dimensions. In Proceedings of the 25th International Conference on Machine Learning, pages 272–279, 2008.

[14] J. Friedman, T. Hastie, and R. Tibshirani. A note on the group Lasso and a sparse group Lasso. arXiv:1001.0736v1 [math.ST], 2010.

[15] T. T. Wu and K. Lange. Coordinate descent algorithms for Lasso penalized regression. The Annals of Applied Statistics, 2(1):224–244, 2008.

[16] R. B. Brem and L. Kruglyak. The landscape of genetic complexity across 5,700 gene expression traits in yeast. Proceedings of the National Academy of Sciences of the United States of America, 102(5):1572–1577, 2005.

[17] S. Purcell, B. Neale, K. Todd-Brown, L. Thomas, M. A. R. Ferreira, D. Bender, J. Maller, P. Sklar, P. I. W. De Bakker, M. J. Daly, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. The American Journal of Human Genetics, 81(3):559–575, 2007.

[18] G. Storz. An expanding universe of noncoding RNAs. Science, 296(5571):1260–1263, 2002.

[19] T. Miyake, J. Reese, C. M. Loch, D. T. Auble, and R. Li. Genome-wide analysis of ARS (autonomously replicating sequence) binding factor 1 (Abf1p)-mediated transcriptional regulation in Saccharomyces cerevisiae. Journal of Biological Chemistry, 279(33):34865–34872, 2004.

[20] G. Yvert, R. B. Brem, J. Whittle, J. M. Akey, E. Foss, E. N. Smith, R. Mackelprang, L. Kruglyak, et al. Trans-acting regulatory variation in Saccharomyces cerevisiae and the role of transcription factors. Nature Genetics, 35(1):57–64, 2003.