
119 nips-2007-Learning with Tree-Averaged Densities and Distributions


Source: pdf

Author: Sergey Kirshner

Abstract: We utilize the ensemble-of-trees framework, a tractable mixture over a superexponential number of tree-structured distributions [1], to develop a new model for multivariate density estimation. The model is based on a construction of tree-structured copulas, multivariate distributions with uniform marginals on [0, 1]. By averaging over all possible tree structures, the new model can approximate distributions with complex variable dependencies. We propose an EM algorithm to estimate the parameters of these tree-averaged models for both the real-valued and the categorical case. Based on the tree-averaged framework, we propose a new model for joint precipitation amounts data on networks of rain stations.
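The abstract calls the mixture over all tree structures "tractable" even though the number of spanning trees grows superexponentially. A standard reason this can work, and to our understanding the one behind the framework of [1], is the Matrix-Tree theorem: when each tree's weight factors into a product of per-edge weights, the sum over all spanning trees equals a determinant of a reduced graph Laplacian. The sketch below illustrates only that identity and is not the paper's model or EM algorithm. The function names and the weight matrix `beta` are hypothetical, and the brute-force enumeration is included purely as a check on small graphs.

```python
import itertools

import numpy as np


def tree_weight_sum(beta):
    """Sum over all spanning trees of the product of edge weights.

    Uses the Matrix-Tree theorem: the sum equals the determinant of the
    graph Laplacian with one row and the matching column removed.
    `beta` is assumed symmetric; its diagonal is ignored.
    """
    beta = np.array(beta, dtype=float)  # copy so the caller's array is untouched
    np.fill_diagonal(beta, 0.0)
    laplacian = np.diag(beta.sum(axis=1)) - beta
    return np.linalg.det(laplacian[1:, 1:])


def is_spanning_tree(edges, d):
    """Return True if d - 1 edges on d nodes contain no cycle (union-find)."""
    parent = list(range(d))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            return False  # this edge would close a cycle
        parent[root_u] = root_v
    return True


def brute_force_tree_weight_sum(beta):
    """Same quantity by explicit enumeration; feasible only for small d."""
    beta = np.asarray(beta, dtype=float)
    d = beta.shape[0]
    all_edges = list(itertools.combinations(range(d), 2))
    total = 0.0
    for subset in itertools.combinations(all_edges, d - 1):
        if is_spanning_tree(subset, d):
            total += np.prod([beta[u, v] for u, v in subset])
    return total


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.random((5, 5))
    weights = a + a.T  # symmetric, positive edge weights
    print(tree_weight_sum(weights), brute_force_tree_weight_sum(weights))
```

The determinant costs a cubic number of operations in the number of variables, whereas the enumeration visits every candidate edge subset. The two functions should agree up to floating-point error on small inputs.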


reference text

[1] M. Meilă and T. Jaakkola. Tractable Bayesian learning of tree belief networks. Statistics and Computing, 16(1):77–92, 2006.

[2] H. Joe. Multivariate Models and Dependence Concepts, volume 73 of Monographs on Statistics and Applied Probability. Chapman & Hall/CRC, 1997.

[3] R. B. Nelsen. An Introduction to Copulas. Springer Series in Statistics. Springer, 2nd edition, 2006.

[4] C. K. Chow and C. N. Liu. Approximating discrete probability distributions with dependence trees. IEEE Transactions on Information Theory, IT-14(3):462–467, May 1968.

[5] M. Meilă and M. I. Jordan. Learning with mixtures of trees. Journal of Machine Learning Research, 1(1):1–48, October 2000.

Figure 2: Averaged test set per-feature log-likelihood for MAGIC data: independent KDE (black solid), product KDE (blue dashed ◦), Gaussian (brown solid ♦), Gaussian copula (orange solid +), Gaussian tree-copula (magenta dashed x), Frank tree-copula (blue dashed), Gaussian tree-averaged copula (red solid x). Axes: training set size (x), log-likelihood per feature (y).

Figure 3: Station map with station locations (red dots), coastline, and the pairs of stations selected according to Delaunay triangulation (dotted lines). Axes: longitude (x), latitude (y).

Figure 4: Scatter plots of log-odds ratios for occurrence (left) and Kendall's τ measure of concordance (right) for all pairs of stations for the historical data vs. HMM-TA (red o), HMM-Tree (blue x), and HMM-CI (green ·). Axes: values from the historical data (x), values from the simulated data (y); the line y = x is drawn in both panels.

[6] A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society Series B-Methodological, 39(1):1–38, 1977.

[7] T. Bedford and R. M. Cooke. Vines – a new graphical model for dependent random variables. The Annals of Statistics, 30(4):1031–1068, 2002.

[8] H. Joe and J.J. Xu. The estimation method of inference functions for margins for multivariate models. Technical report, Department of Statistics, University of British Columbia, 1996.

[9] C. Genest, K. Ghoudi, and L.-P. Rivest. A semiparametric estimation procedure of dependence parameters in multivariate families of distributions. Biometrika, 82:543–552, 1995.

[10] J. Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann Publishers, Inc., San Francisco, California, 1988.

[11] J. Besag. Spatial interaction and the statistical analysis of lattice systems. Journal of the Royal Statistical Society Series B-Methodological, 36(2):192–236, 1974.

[12] S. Kirshner. Learning with tree-averaged densities and distributions. Technical Report TR 08-01, Department of Computing Science, University of Alberta, 2008.

[13] A. Asuncion and D.J. Newman. UCI machine learning repository, 2007.

[14] E. Bellone. Nonhomogeneous Hidden Markov Models for Downscaling Synoptic Atmospheric Patterns to Precipitation Amounts. PhD thesis, Department of Statistics, University of Washington, 2000.