44 nips-2013-B-test: A Non-parametric, Low Variance Kernel Two-sample Test


Source: pdf

Authors: Wojciech Zaremba, Arthur Gretton, Matthew Blaschko

Abstract: A family of maximum mean discrepancy (MMD) kernel two-sample tests is introduced. Members of the test family are called Block-tests or B-tests, since the test statistic is an average over MMDs computed on subsets of the samples. The choice of block size allows control over the tradeoff between test power and computation time. In this respect, the B-test family combines favorable properties of previously proposed MMD two-sample tests: B-tests are more powerful than a linear time test where blocks are just pairs of samples, yet they are more computationally efficient than a quadratic time test where a single large block incorporating all the samples is used to compute a U-statistic. A further important advantage of the B-tests is their asymptotically Normal null distribution: this is by contrast with the U-statistic, which is degenerate under the null hypothesis, and for which estimates of the null distribution are computationally demanding. Recent results on kernel selection for hypothesis testing transfer seamlessly to the B-tests, yielding a means to optimize test power via kernel choice.
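The block-averaging idea described in the abstract can be illustrated with a minimal Python sketch: split both samples into blocks, compute an unbiased MMD^2 estimate on each block, average the block estimates, and compare the average against a Normal null. The function names (`gaussian_kernel`, `block_mmd2`, `b_test`), the Gaussian kernel and its bandwidth, the one-sided rejection rule, and the use of the empirical variance of the block statistics to scale the null are all assumptions made for this illustration; the abstract does not specify them.

```python
import math
import random
from statistics import NormalDist, mean, variance


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points given as sequences of floats."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def block_mmd2(xs, ys, kernel):
    """Unbiased MMD^2 estimate on one block; xs and ys have equal length >= 2."""
    b = len(xs)
    total = 0.0
    for i in range(b):
        for j in range(b):
            if i != j:
                total += (kernel(xs[i], xs[j]) + kernel(ys[i], ys[j])
                          - kernel(xs[i], ys[j]) - kernel(ys[i], xs[j]))
    return total / (b * (b - 1))


def b_test(xs, ys, block_size, kernel=gaussian_kernel):
    """Return (statistic, p_value) for a block-averaged MMD two-sample test.

    Assumption: the null standard deviation is estimated from the empirical
    variance of the per-block statistics, and the test is one-sided.
    """
    n = min(len(xs), len(ys))
    num_blocks = n // block_size
    if block_size < 2 or num_blocks < 2:
        raise ValueError("need block_size >= 2 and at least two full blocks")

    block_stats = []
    for k in range(num_blocks):
        lo, hi = k * block_size, (k + 1) * block_size
        block_stats.append(block_mmd2(xs[lo:hi], ys[lo:hi], kernel))

    statistic = mean(block_stats)
    std_err = math.sqrt(variance(block_stats) / num_blocks)
    if std_err == 0.0:
        raise ValueError("block statistics have zero variance")

    # Large positive values of the averaged MMD^2 are evidence against the null.
    p_value = 1.0 - NormalDist().cdf(statistic / std_err)
    return statistic, p_value


if __name__ == "__main__":
    rng = random.Random(0)
    x = [(rng.gauss(0.0, 1.0),) for _ in range(200)]
    y = [(rng.gauss(3.0, 1.0),) for _ in range(200)]
    print(b_test(x, y, block_size=10))
```

In this sketch, `block_size=2` gives many cheap blocks and a large `block_size` gives a few expensive ones, which mirrors the power versus computation tradeoff the abstract attributes to the choice of block size.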


reference text

[1] Bengt Von Bahr. On the convergence of moments in the central limit theorem. The Annals of Mathematical Statistics, 36(3):808–818, 1965.

[2] L. Baringhaus and C. Franz. On a new multivariate two-sample test. J. Multivariate Anal., 88:190–206, 2004.

[3] A. Berlinet and C. Thomas-Agnan. Reproducing Kernel Hilbert Spaces in Probability and Statistics. Kluwer, 2004.

[4] Andrew C. Berry. The accuracy of the Gaussian approximation to the sum of independent variates. Transactions of the American Mathematical Society, 49(1):122–136, 1941.

[5] B. Efron and R. J. Tibshirani. An Introduction to the Bootstrap. Chapman & Hall, 1993.

[6] M. Fromont, B. Laurent, M. Lerasle, and P. Reynaud-Bouret. Kernels based tests with nonasymptotic bootstrap approaches for two-sample problems. In COLT, 2012.

[7] A. Gretton, K. Fukumizu, Z. Harchaoui, and B. K. Sriperumbudur. A fast, consistent kernel two-sample test. In Advances in Neural Information Processing Systems 22, pages 673–681, 2009.

[8] A. Gretton, K. Fukumizu, C.-H. Teo, L. Song, B. Schölkopf, and A. J. Smola. A kernel statistical test of independence. In Advances in Neural Information Processing Systems 20, pages 585–592, Cambridge, MA, 2008. MIT Press.

[9] A. Gretton, B. Sriperumbudur, D. Sejdinovic, H. Strathmann, S. Balakrishnan, M. Pontil, and K. Fukumizu. Optimal kernel choice for large-scale two-sample tests. In Advances in Neural Information Processing Systems 25, pages 1214–1222, 2012.

[10] Arthur Gretton, Karsten M. Borgwardt, Malte J. Rasch, Bernhard Schölkopf, and Alexander Smola. A kernel two-sample test. J. Mach. Learn. Res., 13:723–773, March 2012.

[11] Arthur Gretton, Karsten M. Borgwardt, Malte J. Rasch, Bernhard Schölkopf, and Alexander J. Smola. A kernel method for the two-sample-problem. In NIPS, pages 513–520, 2006.

[12] Z. Harchaoui, F. Bach, and E. Moulines. Testing for homogeneity with kernel Fisher discriminant analysis. In NIPS, pages 609–616. MIT Press, Cambridge, MA, 2008.

[13] H.-C. Ho and G. Shieh. Two-stage U-statistics for hypothesis testing. Scandinavian Journal of Statistics, 33(4):861–873, 2006.

[14] Norman Lloyd Johnson, Samuel Kotz, and Narayanaswamy Balakrishnan. Continuous univariate distributions. Distributions in statistics. Wiley, 2nd edition, 1994.

[15] A. Kleiner, A. Talwalkar, P. Sarkar, and M. I. Jordan. A scalable bootstrap for massive data. Journal of the Royal Statistical Society, Series B, In Press.

[16] Andrey N. Kolmogorov. Sulla determinazione empirica di una legge di distribuzione. Giornale dell'Istituto Italiano degli Attuari, 4(1):83–91, 1933.

[17] B. Schölkopf. Support vector learning. Oldenbourg, München, Germany, 1997.

[18] D. Sejdinovic, A. Gretton, B. Sriperumbudur, and K. Fukumizu. Hypothesis testing using pairwise distances and associated kernels. In ICML, 2012.

[19] R. Serfling. Approximation Theorems of Mathematical Statistics. Wiley, New York, 1980.

[20] Nickolay Smirnov. Table for estimating the goodness of fit of empirical distributions. The Annals of Mathematical Statistics, 19(2):279–281, 1948.

[21] B. Sriperumbudur, A. Gretton, K. Fukumizu, G. Lanckriet, and B. Schölkopf. Hilbert space embeddings and metrics on probability measures. Journal of Machine Learning Research, 11:1517–1561, 2010.

[22] G. Székely and M. Rizzo. Testing for equal distributions in high dimension. InterStat, (5), November 2004.

[23] G. Székely, M. Rizzo, and N. Bakirov. Measuring and testing dependence by correlation of distances. Ann. Stat., 35(6):2769–2794, 2007.

[24] M. Yamada, T. Suzuki, T. Kanamori, H. Hachiya, and M. Sugiyama. Relative density-ratio estimation for robust distribution comparison. Neural Computation, 25(5):1324–1370, 2013.