jmlr jmlr2010 jmlr2010-111 jmlr2010-111-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Jitkomut Songsiri, Lieven Vandenberghe
Abstract: An algorithm is presented for topology selection in graphical models of autoregressive Gaussian time series. The graph topology of the model represents the sparsity pattern of the inverse spectrum of the time series and characterizes conditional independence relations between the variables. The method proposed in the paper is based on an ℓ1-type nonsmooth regularization of the conditional maximum likelihood estimation problem. We show that this reduces to a convex optimization problem and describe a large-scale algorithm that solves the dual problem via the gradient projection method. Results of experiments with randomly generated and real data sets are also included. Keywords: graphical models, time series, topology selection, convex optimization
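The abstract describes an ℓ1-regularized maximum likelihood problem whose dual is solved by gradient projection. As a rough illustration of that general idea, the sketch below applies gradient projection to the dual of the simpler static covariance-selection problem (in the spirit of Banerjee et al., 2008, cited below), not to the paper's autoregressive formulation; the function names (dual_gradient_projection, proj_box), the step-size rule, and the use of NumPy are assumptions made for illustration only.

# Minimal sketch (not the paper's algorithm): gradient projection on the
# dual of an l1-penalized Gaussian maximum-likelihood problem,
#     maximize  log det(S + U)   subject to  |U_ij| <= lam,
# whose solution gives a sparse inverse covariance estimate X = (S + U)^{-1}.
# All names below (proj_box, dual_gradient_projection, lam, step) are
# illustrative assumptions, not identifiers from the paper.
import numpy as np


def proj_box(U, lam):
    # Entrywise projection onto the box {U : |U_ij| <= lam}.
    return np.clip(U, -lam, lam)


def is_pos_def(A):
    # Positive-definiteness test via a Cholesky attempt.
    try:
        np.linalg.cholesky(A)
        return True
    except np.linalg.LinAlgError:
        return False


def dual_gradient_projection(S, lam, step=0.1, max_iter=500, tol=1e-6):
    # Projected gradient ascent on f(U) = log det(S + U) over the box.
    p = S.shape[0]
    U = np.zeros((p, p))
    for _ in range(max_iter):
        grad = np.linalg.inv(S + U)      # gradient of log det(S + U)
        t = step
        # Backtrack so that S + U_new stays positive definite.
        while True:
            U_new = proj_box(U + t * grad, lam)
            if is_pos_def(S + U_new):
                break
            t *= 0.5
        if np.max(np.abs(U_new - U)) < tol:
            U = U_new
            break
        U = U_new
    # Primal estimate: sparse inverse covariance; its (near-)zero pattern
    # suggests the graph topology.
    return np.linalg.inv(S + U)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((5, 200))
    S = A @ A.T / 200 + 0.1 * np.eye(5)   # regularized sample covariance
    X = dual_gradient_projection(S, lam=0.1)
    print(np.round(X, 3))

In the paper itself the variable is the inverse spectrum of an autoregressive process and the penalty acts on groups of coefficients, so the dual and the projection are more involved; the sketch only mirrors the gradient-projection structure mentioned in the abstract.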
References:
A. Abdelwahab, O. Amor, and T. Abdelwahed. The analysis of the interdependence structure in international financial markets by graphical models. International Research Journal of Finance and Economics, 15:291–306, 2008.
N. Bani Asadi, I. Rish, K. Scheinberg, D. Kanevsky, and B. Ramabhadran. A MAP approach to learning sparse Gaussian Markov networks. In IEEE International Conference on Acoustics, Speech and Signal Processing, pages 1721–1724, 2009.
F. R. Bach and M. I. Jordan. Learning graphical models for stationary time series. IEEE Transactions on Signal Processing, 32:2189–2199, 2004.
O. Banerjee, L. El Ghaoui, and A. d'Aspremont. Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data. Journal of Machine Learning Research, 9:485–516, 2008.
J. Barzilai and J. M. Borwein. Two-point step size gradient methods. IMA Journal of Numerical Analysis, 8:141–148, 1988.
A. Beck and M. Teboulle. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences, 2(1):183–202, 2009.
D. P. Bertsekas. Nonlinear Programming. Athena Scientific, Nashua, New Hampshire, second edition, 1999.
D. A. Bessler and J. Yang. The structure of interdependence in international stock markets. Journal of International Money and Finance, 22(2):261–287, 2003.
E. G. Birgin, J. M. Martínez, and M. Raydan. Nonmonotone spectral projected gradient methods on convex sets. SIAM Journal on Optimization, 10(4):1196–1211, 2000.
E. G. Birgin, J. M. Martínez, and M. Raydan. Inexact spectral projected gradient methods on convex sets. IMA Journal of Numerical Analysis, 23:539–559, 2003.
S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, Cambridge, 2004. www.stanford.edu/~boyd/cvxbook.
D. R. Brillinger. Time Series: Data Analysis and Theory. Holden-Day, San Francisco, California, expanded edition, 1981.
J. Dahl, V. Roychowdhury, and L. Vandenberghe. Maximum-likelihood estimation of multivariate normal graphical models: large-scale numerical implementation and topology selection. Technical report, Electrical Engineering Department, UCLA, 2005.
R. Dahlhaus. Graphical interaction models for multivariate time series. Metrika, 51(2):157–172, 2000.
R. Dahlhaus, M. Eichler, and J. Sandkühler. Identification of synaptic connections in neural ensembles by graphical models. Journal of Neuroscience Methods, 77(1):93–107, 1997.
A. P. Dempster. Covariance selection. Biometrics, 28:157–175, 1972.
J. Duchi, S. Gould, and D. Koller. Projected subgradient methods for learning sparse Gaussians. In Proceedings of the Conference on Uncertainty in Artificial Intelligence, 2008.
M. Eichler. Fitting graphical interaction models to multivariate time series. In Proceedings of the 22nd Conference on Uncertainty in Artificial Intelligence, 2006.
M. Eichler. Testing nonparametric and semiparametric hypotheses in vector stationary processes. Journal of Multivariate Analysis, 99(5):968–1009, 2008.
M. Eichler, R. Dahlhaus, and J. Sandkühler. Partial correlation analysis for the identification of synaptic connections. Biological Cybernetics, 89(4):289–302, 2003.
S. Feiler, K. G. Müller, A. Müller, R. Dahlhaus, and W. Eich. Using interaction graphs for analysing the therapy process. Psychotherapy and Psychosomatics, 74(2):93–99, 2005.
M. A. T. Figueiredo, R. D. Nowak, and S. J. Wright. Gradient projection for sparse reconstruction: Application to compressed sensing and other inverse problems. IEEE Journal of Selected Topics in Signal Processing, 1(4):586–597, 2007.
R. Fried and V. Didelez. Decomposability and selection of graphical models for multivariate time series. Biometrika, 90(2):251–267, 2003.
J. Friedman, T. Hastie, and R. Tibshirani. Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9(3):432, 2008.
U. Gather, M. Imhoff, and R. Fried. Graphical models for multivariate time series from intensive care monitoring. Statistics in Medicine, 21(18):2685–2701, 2002.
T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning: Data Mining, Inference and Prediction. Springer, New York, second edition, 2009.
J. Z. Huang, N. Liu, M. Pourahmadi, and L. Liu. Covariance matrix selection and estimation via penalised normal likelihood. Biometrika, 93(1):85–98, 2006.
Y. Kim, J. Kim, and Y. Kim. Blockwise sparse regression. Statistica Sinica, 16(2):375–390, 2006.
S. L. Lauritzen. Graphical Models. Oxford University Press, Oxford, 1996.
K. Lounici. Sup-norm convergence rate and sign concentration property of Lasso and Dantzig estimators. Electronic Journal of Statistics, 2:90–102, 2008.
Z. Lu. Smooth optimization approach for sparse covariance selection. SIAM Journal on Optimization, 19:1807, 2009.
Z. Lu. Adaptive first-order methods for general sparse inverse covariance selection. SIAM Journal on Matrix Analysis and Applications, 2010. Forthcoming.
Z. Lu and Y. Zhang. An augmented Lagrangian approach for sparse principal component analysis. 2009. Submitted.
N. Meinshausen and P. Bühlmann. High-dimensional graphs and variable selection with the Lasso. Annals of Statistics, 34(3):1436–1462, 2006.
T. M. Mitchell, R. Hutchinson, R. S. Niculescu, F. Pereira, and X. Wang. Learning to decode cognitive states from brain images. Machine Learning, 57(1):145–175, 2004.
Y. Nesterov. Introductory Lectures on Convex Optimization. Kluwer Academic Publishers, Dordrecht, The Netherlands, 2004.
Y. Nesterov. Smooth minimization of non-smooth functions. Mathematical Programming Series A, 103:127–152, 2005.
B. T. Polyak. Introduction to Optimization. Optimization Software, Inc., New York, 1987.
J. Proakis. Digital Communications. McGraw-Hill, New York, fourth edition, 2001.
P. Ravikumar, M. J. Wainwright, G. Raskutti, and B. Yu. High-dimensional covariance estimation by minimizing ℓ1-penalized log-determinant divergence, 2008. URL http://www.citebase.org/abstract?id=oai:arXiv.org:0811.3628. Available from arxiv.org/abs/0811.3628.
A. J. Rothman, P. J. Bickel, E. Levina, and J. Zhu. Sparse permutation invariant covariance estimation. Electronic Journal of Statistics, 2:494–515, 2008.
R. Salvador, J. Suckling, C. Schwarzbauer, and E. Bullmore. Undirected graphs of frequency-dependent functional connectivity in whole brain networks. Philosophical Transactions of the Royal Society B: Biological Sciences, 360(1457):937–946, 2005.
K. Scheinberg and I. Rish. SINCO: a greedy coordinate ascent method for the sparse inverse covariance selection problem. 2009. Submitted.
J. Songsiri, J. Dahl, and L. Vandenberghe. Graphical models of autoregressive processes. In Y. Eldar and D. Palomar, editors, Convex Optimization in Signal Processing and Communications. Cambridge University Press, Cambridge, 2009.
P. Stoica and R. L. Moses. Introduction to Spectral Analysis. Prentice Hall, London, 1997.
R. Tibshirani. Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society, Series B (Methodological), 58(1):267–288, 1996.
J. Timmer, M. Lauk, S. Häußler, V. Radt, B. Köster, B. Hellwig, B. Guschlbauer, C. H. Lücking, M. Eichler, and G. Deuschl. Cross-spectral analysis of tremor time series. International Journal of Bifurcation and Chaos in Applied Sciences and Engineering, 10(11):2595–2610, 2000.
K.-C. Toh. Primal-dual path-following algorithms for determinant maximization problems with linear matrix inequalities. Computational Optimization and Applications, 14:309–330, 1999.
P. Tseng. On accelerated proximal gradient methods for convex-concave optimization. SIAM Journal on Optimization, 2008. Submitted.
L. Vandenberghe, S. Boyd, and S.-P. Wu. Determinant maximization with linear matrix inequality constraints. SIAM Journal on Matrix Analysis and Applications, 19(2):499–533, April 1998.
S. J. Wright, R. D. Nowak, and M. A. T. Figueiredo. Sparse reconstruction by separable approximation. IEEE Transactions on Signal Processing, 57(7):2479–2493, 2009.
M. Yuan and Y. Lin. Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society, Series B (Statistical Methodology), 68(1):49–67, 2006.
M. Yuan and Y. Lin. Model selection and estimation in the Gaussian graphical model. Biometrika, 94(1):19, 2007.
P. Zhao, G. Rocha, and B. Yu. The composite absolute penalties family for grouped and hierarchical variable selection. The Annals of Statistics, 37(6A):3468–3497, 2009.