nips nips2010 nips2010-26 nips2010-26-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Seunghak Lee, Jun Zhu, Eric P. Xing
Abstract: To understand the relationship between genomic variation within a population and complex diseases, it is essential to detect eQTLs that are associated with phenotypic effects. However, detecting eQTLs remains a challenge because the underlying mechanisms are complex and the number of genetic loci involved is very large compared to the number of samples. To address this problem, it is desirable to take advantage of the structure of the data and of prior information about genomic locations, such as conservation scores and transcription factor binding sites. In this paper, we propose a novel regularized regression approach for detecting eQTLs that considers related traits simultaneously while incorporating many regulatory features. We first present a Bayesian network for a multi-task learning problem that includes priors on SNPs, making it possible to estimate the significance of each covariate adaptively. We then find the maximum a posteriori (MAP) estimate of the regression coefficients and estimate the weights of the covariates jointly. This optimization procedure is efficient because it can be carried out by alternating between projected gradient descent and coordinate descent. Experimental results on simulated and real yeast datasets confirm that our model outperforms previous methods for finding eQTLs.
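The two optimization building blocks named in the abstract can be illustrated with a minimal sketch. This is not the paper's model: the abstract does not state the objective or the constraint set, so the sketch assumes a plain Lasso objective for the coordinate descent part and a probability-simplex constraint for the projected gradient part. All function names are hypothetical.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t (closed-form Lasso coordinate update)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=100):
    """Minimize 0.5 * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    X is a list of rows (samples by features); y is a list of responses.
    Assumed objective for illustration only.
    """
    n, p = len(X), len(X[0])
    b = [0.0] * p
    residual = list(y)  # residual = y - X b, with b = 0 initially
    for _ in range(n_sweeps):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            norm_sq = sum(v * v for v in col)
            if norm_sq == 0.0:
                continue
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(col[i] * residual[i] for i in range(n)) + norm_sq * b[j]
            new_bj = soft_threshold(rho, lam) / norm_sq
            delta = new_bj - b[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= col[i] * delta
                b[j] = new_bj
    return b


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_i >= 0, sum(w) = 1} (sort-based)."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        candidate = (cumulative - 1.0) / k
        if u_k - candidate > 0:
            theta = candidate  # keep the threshold from the last valid k
    return [max(x - theta, 0.0) for x in v]


def projected_gradient_step(w, grad, step):
    """One projected gradient step: move against the gradient, then project."""
    return project_to_simplex([w_i - step * g_i for w_i, g_i in zip(w, grad)])
```

An alternating scheme of the kind the abstract describes would call a coefficient update such as `lasso_coordinate_descent` and a weight update such as `projected_gradient_step` in turn until both stop changing; the actual updates depend on the model defined in the paper.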
[1] R. Sladek, G. Rocheleau, J. Rung, C. Dina, L. Shen, D. Serre, P. Boutin, D. Vincent, A. Belisle, S. Hadjadj, et al. A genome-wide association study identifies novel risk loci for type 2 diabetes. Nature, 445(7130):881–885, 2007.
[2] R. Tibshirani. Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1):267–288, 1996.
[3] S.I. Lee, A.M. Dudley, D. Drubin, P.A. Silver, N.J. Krogan, D. Pe’er, and D. Koller. Learning a prior on regulatory potential from eQTL data. PLoS Genetics, 5(1):e1000358, 2009.
[4] S. Kim and E. P. Xing. Statistical estimation of correlated genome associations to a quantitative trait network. PLoS Genetics, 5(8):e1000587, 2009.
[5] G. Obozinski, B. Taskar, and M. Jordan. Multi-task feature selection. Technical report, Department of Statistics, University of California, Berkeley, 2006.
[6] M. Szafranski, Y. Grandvalet, and P. Morizet-Mahoudeaux. Hierarchical penalization. Advances in Neural Information Processing Systems, 20:1457–1464, 2007.
[7] H. Zou. The adaptive Lasso and its oracle properties. Journal of the American Statistical Association, 101(476):1418–1429, 2006.
[8] S.I. Lee, V. Chatalbashev, D. Vickrey, and D. Koller. Learning a meta-level prior for feature relevance from multiple related tasks. In Proceedings of the 24th International Conference on Machine Learning, pages 489–496, 2007.
[9] T. Park and G. Casella. The Bayesian Lasso. Journal of the American Statistical Association, 103(482):681–686, 2008.
[10] B. M. Marlin, M. Schmidt, and K. P. Murphy. Group sparse priors for covariance estimation. In Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence, pages 383–392, 2009.
[11] E. Gómez, M. A. Gómez-Villegas, and J. M. Marín. A multivariate generalization of the power exponential family of distributions. Communications in Statistics-Theory and Methods, 27(3):589–600, 1998.
[12] H. Lee, A. Battle, R. Raina, and A. Y. Ng. Efficient sparse coding algorithms. Advances in Neural Information Processing Systems, 19:801–808, 2007.
[13] J. Duchi, S. Shalev-Shwartz, Y. Singer, and T. Chandra. Efficient projections onto the ℓ1 ball for learning in high dimensions. In Proceedings of the 25th International Conference on Machine Learning, pages 272–279, 2008.
[14] J. Friedman, T. Hastie, and R. Tibshirani. A note on the group Lasso and a sparse group Lasso. arXiv:1001.0736v1 [math.ST], 2010.
[15] T. T. Wu and K. Lange. Coordinate descent algorithms for Lasso penalized regression. The Annals of Applied Statistics, 2(1):224–244, 2008.
[16] R. B. Brem and L. Kruglyak. The landscape of genetic complexity across 5,700 gene expression traits in yeast. Proceedings of the National Academy of Sciences of the United States of America, 102(5):1572–1577, 2005.
[17] S. Purcell, B. Neale, K. Todd-Brown, L. Thomas, M. A. R. Ferreira, D. Bender, J. Maller, P. Sklar, P. I. W. De Bakker, M. J. Daly, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. The American Journal of Human Genetics, 81(3):559–575, 2007.
[18] G. Storz. An expanding universe of noncoding RNAs. Science, 296(5571):1260–1263, 2002.
[19] T. Miyake, J. Reese, C. M. Loch, D. T. Auble, and R. Li. Genome-wide analysis of ARS (autonomously replicating sequence) binding factor 1 (Abf1p)-mediated transcriptional regulation in Saccharomyces cerevisiae. Journal of Biological Chemistry, 279(33):34865–34872, 2004.
[20] G. Yvert, R. B. Brem, J. Whittle, J. M. Akey, E. Foss, E. N. Smith, R. Mackelprang, L. Kruglyak, et al. Trans-acting regulatory variation in Saccharomyces cerevisiae and the role of transcription factors. Nature Genetics, 35(1):57–64, 2003.