9 nips-2013-A Kernel Test for Three-Variable Interactions

Source: pdf

Author: Dino Sejdinovic, Arthur Gretton, Wicher Bergsma

Abstract: We introduce kernel nonparametric tests for Lancaster three-variable interaction and for total independence, using embeddings of signed measures into a reproducing kernel Hilbert space. The resulting test statistics are straightforward to compute, and are used in powerful interaction tests, which are consistent against all alternatives for a large family of reproducing kernels. We show the Lancaster test to be sensitive to cases where two independent causes individually have weak influence on a third dependent variable, but their combined effect has a strong influence. This makes the Lancaster test especially suited to finding structure in directed graphical models, where it outperforms competing nonparametric tests in detecting such V-structures.
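To illustrate why such a statistic can be called straightforward to compute, here is a minimal sketch, not the authors' implementation. It assumes Gaussian kernels with a fixed bandwidth and an empirical Lancaster statistic of the form (1/n²) times the sum of all entries of the elementwise product of the three centered Gram matrices. The function names (`gaussian_gram`, `center`, `lancaster_statistic`, `permutation_pvalue`), the bandwidth, and the permutation scheme are assumptions made for illustration. The permutation test shown covers only one factorization hypothesis, (X, Y) independent of Z, by permuting the Z sample.

```python
import numpy as np


def gaussian_gram(x, bandwidth=1.0):
    """Gaussian-kernel Gram matrix for samples stored in the rows of x."""
    x = np.asarray(x, dtype=float)
    x = x.reshape(len(x), -1)
    sq_norms = np.sum(x ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth ** 2))


def center(gram):
    """Double-center a Gram matrix: H G H with H = I - (1/n) 1 1^T."""
    n = gram.shape[0]
    h = np.eye(n) - np.ones((n, n)) / n
    return h @ gram @ h


def lancaster_statistic(k, l, m):
    """Assumed form: (1/n^2) * sum of entries of (HKH * HLH * HMH)."""
    n = k.shape[0]
    return np.sum(center(k) * center(l) * center(m)) / n ** 2


def permutation_pvalue(k, l, m, num_permutations=200, seed=0):
    """Permutation p-value for the hypothesis (X, Y) independent of Z.

    Only the Z sample is permuted; centering commutes with permutation,
    so the centered Gram matrix of Z can be re-indexed directly.
    """
    rng = np.random.default_rng(seed)
    n = k.shape[0]
    kl_centered = center(k) * center(l)
    m_centered = center(m)
    observed = np.sum(kl_centered * m_centered) / n ** 2
    exceed = 0
    for _ in range(num_permutations):
        p = rng.permutation(n)
        stat = np.sum(kl_centered * m_centered[np.ix_(p, p)]) / n ** 2
        exceed += stat >= observed
    return (exceed + 1) / (num_permutations + 1)
```

A possible usage, with synthetic data in which X and Y are independent and Z depends on them only jointly: draw `x` and `y` from a standard normal, set `z = np.sign(x * y) * np.abs(noise)`, build the three Gram matrices with `gaussian_gram`, and pass them to `permutation_pvalue`. Testing for a V-structure would require repeating this for the other factorization hypotheses and correcting for multiple testing, which this sketch leaves out.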

reference text

[1] A. Gretton, O. Bousquet, A. Smola, and B. Schölkopf. Measuring statistical dependence with Hilbert-Schmidt norms. In ALT, pages 63–78, 2005.

[2] G. Székely, M. Rizzo, and N.K. Bakirov. Measuring and testing dependence by correlation of distances. Ann. Stat., 35(6):2769–2794, 2007.

[3] D. Sejdinovic, B. Sriperumbudur, A. Gretton, and K. Fukumizu. Equivalence of distance-based and RKHS-based statistics in hypothesis testing. Ann. Stat., 41(5):2263–2291, 2013.

[4] F. R. Bach and M. I. Jordan. Kernel independent component analysis. J. Mach. Learn. Res., 3:1–48, 2002.

[5] K. Fukumizu, F. Bach, and A. Gretton. Statistical consistency of kernel canonical correlation analysis. J. Mach. Learn. Res., 8:361–383, 2007.

[6] J. Dauxois and G. M. Nkiet. Nonlinear canonical analysis and independence tests. Ann. Stat., 26(4):1254–1278, 1998.

[7] D. Pal, B. Poczos, and Cs. Szepesvari. Estimation of Rényi entropy and mutual information based on generalized nearest-neighbor graphs. In NIPS 23, 2010.

[8] A. Kankainen. Consistent Testing of Total Independence Based on the Empirical Characteristic Function. PhD thesis, University of Jyväskylä, 1995.

[9] S. Bernstein. The Theory of Probabilities. Gastehizdat Publishing House, Moscow, 1946.

[10] M. Kayano, I. Takigawa, M. Shiga, K. Tsuda, and H. Mamitsuka. Efficiently finding genome-wide three-way gene interactions from transcript- and genotype-data. Bioinformatics, 25(21):2735–2743, 2009.

[11] N. Meinshausen and P. Bühlmann. High dimensional graphs and variable selection with the lasso. Ann. Stat., 34(3):1436–1462, 2006.

[12] P. Ravikumar, M.J. Wainwright, G. Raskutti, and B. Yu. High-dimensional covariance estimation by minimizing ℓ1-penalized log-determinant divergence. Electron. J. Stat., 4:935–980, 2011.

[13] J. Pearl. Causality: Models, Reasoning and Inference. Cambridge University Press, 2001.

[14] P. Spirtes, C. Glymour, and R. Scheines. Causation, Prediction, and Search. 2nd edition, 2000.

[15] M. Kalisch and P. Bühlmann. Estimating high-dimensional directed acyclic graphs with the PC algorithm. J. Mach. Learn. Res., 8:613–636, 2007.

[16] X. Sun, D. Janzing, B. Schölkopf, and K. Fukumizu. A kernel-based causal learning algorithm. In ICML, pages 855–862, 2007.

[17] R. Tillman, A. Gretton, and P. Spirtes. Nonlinear directed acyclic structure learning with weakly additive noise models. In NIPS 22, 2009.

[18] K. Zhang, J. Peters, D. Janzing, and B. Schoelkopf. Kernel-based conditional independence test and application in causal discovery. In UAI, pages 804–813, 2011.

[19] A. Gretton, K. Fukumizu, C.-H. Teo, L. Song, B. Schölkopf, and A. Smola. A kernel statistical test of independence. In NIPS 20, pages 585–592, Cambridge, MA, 2008. MIT Press.

[20] K. Fukumizu, A. Gretton, X. Sun, and B. Schölkopf. Kernel measures of conditional dependence. In NIPS 20, pages 489–496, 2008.

[21] H.O. Lancaster. The Chi-Squared Distribution. Wiley, London, 1969.

[22] B. Streitberg. Lancaster interactions revisited. Ann. Stat., 18(4):1878–1885, 1990.

[23] K. Fukumizu, B. Sriperumbudur, A. Gretton, and B. Schoelkopf. Characteristic kernels on groups and semigroups. In NIPS 21, pages 473–480, 2009.

[24] A. Kankainen. Consistent Testing of Total Independence Based on the Empirical Characteristic Function. PhD thesis, University of Jyväskylä, 1995.

[25] A. Berlinet and C. Thomas-Agnan. Reproducing Kernel Hilbert Spaces in Probability and Statistics. Kluwer, 2004.

[26] B. Sriperumbudur, K. Fukumizu, and G. Lanckriet. Universality, characteristic kernels and RKHS embedding of measures. J. Mach. Learn. Res., 12:2389–2410, 2011.

[27] B. Sriperumbudur, A. Gretton, K. Fukumizu, G. Lanckriet, and B. Schölkopf. Hilbert space embeddings and metrics on probability measures. J. Mach. Learn. Res., 11:1517–1561, 2010.

[28] A. Gretton, K. Borgwardt, M. Rasch, B. Schölkopf, and A. Smola. A kernel two-sample test. J. Mach. Learn. Res., 13:723–773, 2012.

[29] D. Sejdinovic, A. Gretton, B. Sriperumbudur, and K. Fukumizu. Hypothesis testing using pairwise distances and associated kernels. In ICML, 2012.

[30] G. Székely and M. Rizzo. Testing for equal distributions in high dimension. InterStat, (5), November 2004.

[31] L. Baringhaus and C. Franz. On a new multivariate two-sample test. J. Multivariate Anal., 88(1):190–206, 2004.

[32] G. Székely and M. Rizzo. Brownian distance covariance. Ann. Appl. Stat., 4(3):1233–1303, 2009.

[33] R. Serfling. Approximation Theorems of Mathematical Statistics. Wiley, New York, 1980.

[34] T.P. Speed. Cumulants and partition lattices. Austral. J. Statist., 25:378–388, 1983.

[35] S. Holm. A simple sequentially rejective multiple test procedure. Scand. J. Statist., 6(2):65–70, 1979.

[36] A. Gretton, K. Fukumizu, Z. Harchaoui, and B. Sriperumbudur. A fast, consistent kernel two-sample test. In NIPS 22, Red Hook, NY, 2009. Curran Associates Inc.