
NIPS 2008, Paper 112: Kernel Measures of Independence for non-iid Data



Author: Xinhua Zhang, Le Song, Arthur Gretton, Alex J. Smola

Abstract: Many machine learning algorithms can be formulated in the framework of statistical independence such as the Hilbert Schmidt Independence Criterion. In this paper, we extend this criterion to deal with structured and interdependent observations. This is achieved by modeling the structures using undirected graphical models and comparing the Hilbert space embeddings of distributions. We apply this new criterion to independent component analysis and sequence clustering.
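For context, the Hilbert Schmidt Independence Criterion mentioned in the abstract has a simple empirical estimator for i.i.d. data: center the two kernel matrices and take a trace. The sketch below shows that standard biased estimator only, not the structured, non-iid extension this paper proposes; the Gaussian kernel choice and bandwidth are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(x, sigma=1.0):
    # Gaussian RBF kernel matrix from pairwise squared distances.
    d2 = (x[:, None] - x[None, :]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

def hsic(x, y, sigma=1.0):
    """Biased empirical HSIC estimate: tr(K H L H) / (n - 1)^2,
    where H = I - (1/n) 11^T is the centering matrix."""
    n = len(x)
    K = rbf_kernel(x, sigma)
    L = rbf_kernel(y, sigma)
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

rng = np.random.default_rng(0)
x = rng.standard_normal(200)
y_dep = x + 0.1 * rng.standard_normal(200)   # strongly dependent on x
y_ind = rng.standard_normal(200)             # independent of x
print(hsic(x, y_dep), hsic(x, y_ind))        # dependent pair scores much higher
```

HSIC is (approximately) zero when the variables are independent, provided the kernel is characteristic [7, 16]; the dependent pair above yields a markedly larger value than the independent one.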


References

[1] Aaronson, J., Burton, R., Dehling, H., Gilat, D., Hill, T., & Weiss, B. (1996). Strong laws for L and U-statistics. Transactions of the American Mathematical Society, 348, 2845–2865.

[2] Altun, Y., Smola, A. J., & Hofmann, T. (2004). Exponential families for conditional random fields. In UAI.

[3] Bach, F. R., & Jordan, M. I. (2002). Kernel independent component analysis. JMLR, 3, 1–48.

[4] Besag, J. (1974). Spatial interaction and the statistical analysis of lattice systems (with discussion). J. Roy. Stat. Soc. B, 36(2), 192–236.

[5] Borovkova, S., Burton, R., & Dehling, H. (2001). Limit theorems for functionals of mixing processes with applications to dimension estimation. Transactions of the American Mathematical Society, 353(11), 4261–4318.

[6] Gretton, A., Fukumizu, K., Teo, C.-H., Song, L., Schölkopf, B., & Smola, A. (2008). A kernel statistical test of independence. Tech. Rep. 168, MPI for Biological Cybernetics.

[7] Gretton, A., Herbrich, R., Smola, A., Bousquet, O., & Schölkopf, B. (2005). Kernel methods for measuring independence. JMLR, 6, 2075–2129.

[8] Hammersley, J. M., & Clifford, P. E. (1971). Markov fields on finite graphs and lattices. Unpublished manuscript.

[9] Hosseini, S., & Jutten, C. (2003). On the separability of nonlinear mixtures of temporally correlated sources. IEEE Signal Processing Letters, 10(2), 43–46.

[10] Ng, A., Jordan, M., & Weiss, Y. (2002). On spectral clustering: Analysis and an algorithm. In NIPS.

[11] Nguyen, X., Wainwright, M. J., & Jordan, M. I. (2008). Estimating divergence functionals and the likelihood ratio by penalized convex risk minimization. In NIPS.

[12] Shen, H., Jegelka, S., & Gretton, A. (submitted). Fast kernel-based independent component analysis. IEEE Transactions on Signal Processing.

[13] Song, L., Smola, A., Borgwardt, K., & Gretton, A. (2007). Colored maximum variance unfolding. In NIPS.

[14] Song, L., Smola, A., Gretton, A., & Borgwardt, K. (2007). A dependence maximization view of clustering. In ICML.

[15] Song, L., Smola, A., Gretton, A., Borgwardt, K., & Bedo, J. (2007). Supervised feature selection via dependence estimation. In ICML.

[16] Sriperumbudur, B., Gretton, A., Fukumizu, K., Lanckriet, G., & Schölkopf, B. (2008). Injective Hilbert space embeddings of probability measures. In COLT.

[17] Steinwart, I. (2002). The influence of the kernel on the consistency of support vector machines. JMLR, 2.

[18] Ziehe, A., & Müller, K.-R. (1998). TDSEP – an efficient algorithm for blind separation using time structure. In ICANN.