
29 nips-2008-Automatic online tuning for fast Gaussian summation


Source: pdf

Author: Vlad I. Morariu, Balaji V. Srinivasan, Vikas C. Raykar, Ramani Duraiswami, Larry S. Davis

Abstract: Many machine learning algorithms require the summation of Gaussian kernel functions, an expensive operation if implemented straightforwardly. Several methods have been proposed to reduce the computational complexity of evaluating such sums, including tree- and analysis-based methods. These achieve varying speedups depending on the bandwidth, dimension, and prescribed error, making the choice between methods difficult for machine learning tasks. We provide an algorithm that combines tree methods with the Improved Fast Gauss Transform (IFGT). As originally proposed, the IFGT suffers from two problems: (1) the Taylor series expansion does not perform well for very low bandwidths, and (2) parameter selection is not trivial and can drastically affect performance and ease of use. We address the first problem by employing a tree data structure, resulting in four evaluation methods whose performance varies based on the distribution of sources and targets and input parameters such as desired accuracy and bandwidth. To solve the second problem, we present an online tuning approach that results in a black-box method that automatically chooses the evaluation method and its parameters to yield the best performance for the input data, desired accuracy, and bandwidth. In addition, the new IFGT parameter selection approach allows for tighter error bounds. Our approach chooses the fastest method at negligible additional cost, and has superior performance in comparisons with previous approaches.
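For orientation, the quantity being accelerated is the discrete Gauss transform G(y_j) = sum_i w_i exp(-||y_j - x_i||^2 / h^2), whose direct evaluation costs O(NM) for N sources and M targets. The sketch below (plain NumPy) shows that direct baseline together with a toy auto-tuning wrapper that times candidate evaluators on a subsample of targets and dispatches the full problem to the fastest one. The evaluator signature, the subsample-timing heuristic, and all function names are illustrative assumptions, not the paper's actual cost-based selection; they are only meant to make the "choose the fastest method automatically" idea concrete.

```python
import time
import numpy as np

def direct_gauss_sum(sources, targets, weights, h):
    """Direct O(N*M) evaluation of the discrete Gauss transform
    G(y_j) = sum_i w_i * exp(-||y_j - x_i||^2 / h^2)."""
    sq_dists = ((targets[:, None, :] - sources[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / h ** 2) @ weights

def auto_tune(evaluators, sources, targets, weights, h, eps, n_sample=64):
    """Toy method selection: time each candidate evaluator on a small
    subsample of targets, then dispatch the full problem to the fastest.
    (Illustrative only; the real tuning in the paper is cheaper than
    running trial evaluations of every method.)"""
    idx = np.random.choice(len(targets), min(n_sample, len(targets)), replace=False)
    sample = targets[idx]

    def runtime(evaluate):
        t0 = time.perf_counter()
        evaluate(sources, sample, weights, h, eps)
        return time.perf_counter() - t0

    best = min(evaluators, key=runtime)
    return best(sources, targets, weights, h, eps)

# Example usage: the direct sum is the only candidate here; a real list
# would also include tree-based and IFGT evaluators honoring the bound eps.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.random((1000, 3))   # sources
    Y = rng.random((500, 3))    # targets
    w = rng.random(1000)        # source weights
    candidates = [lambda s, t, q, h, eps: direct_gauss_sum(s, t, q, h)]
    G = auto_tune(candidates, X, Y, w, h=0.5, eps=1e-6)
    print(G.shape)              # (500,)
```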


reference text

[1] M.P. Wand and M.C. Jones. Kernel Smoothing. Chapman and Hall, 1995.

[2] C. K. I. Williams and C. E. Rasmussen. Gaussian processes for regression. In NIPS, 1995.

[3] M. Klaas, M. Briers, N. de Freitas, A. Doucet, S. Maskell, and D. Lang. Fast particle smoothing: if I had a million particles. In ICML, 2006.

[4] N. de Freitas, Y. Wang, M. Mahdaviani, and D. Lang. Fast Krylov methods for N-body learning. In NIPS, 2006.

[5] L. Greengard and J. Strain. The fast Gauss transform. SIAM J. Sci. Stat. Comput., 1991.

[6] C. Yang, R. Duraiswami, N. A. Gumerov, and L. S. Davis. Improved fast Gauss transform and efficient kernel density estimation. In ICCV, 2003.

[7] A. G. Gray and A. W. Moore. ‘N-body’ problems in statistical learning. In NIPS, 2000.

[8] A. G. Gray and A. W. Moore. Nonparametric density estimation: Toward computational tractability. In SIAM Data Mining, 2003.

[9] D. Lee, A. Gray, and A. Moore. Dual-tree fast Gauss transforms. In NIPS, 2006.

[10] D. Lee and A. G. Gray. Faster Gaussian summation: Theory and experiment. In UAI, 2006.

[11] B. W. Silverman. Density Estimation for Statistics and Data Analysis. Chapman and Hall, 1986.

[12] C. Yang, R. Duraiswami, and L. S. Davis. Efficient kernel machines using the improved fast Gauss transform. In NIPS, 2004.

[13] V. Raykar, C. Yang, R. Duraiswami, and N. Gumerov. Fast computation of sums of Gaussians in high dimensions. Technical Report UMD-CS-TR-4767, University of Maryland, 2005.

[14] D. Lang, M. Klaas, and N. de Freitas. Empirical testing of fast kernel density estimation algorithms. Technical Report UBC TR-2005-03, University of British Columbia, Vancouver, 2005.

[15] S. Arya and D. Mount. Approximate nearest neighbor queries in fixed dimensions. In SODA, 1993.

[16] S. Arya, D. M. Mount, N. S. Netanyahu, R. Silverman, and A. Y. Wu. An optimal algorithm for approximate nearest neighbor searching in fixed dimensions. Journal of the ACM, 1998.

[17] M. Frigo and S. G. Johnson. The design and implementation of FFTW3. Proceedings of the IEEE, 2005.

[18] R. C. Whaley, A. Petitet, and J. J. Dongarra. Automated empirical optimization of software and the ATLAS project. Parallel Computing, 27(1–2):3–35, 2001.

[19] T. F. Gonzalez. Clustering to minimize the maximum intercluster distance. Theoretical Computer Science, 38:293–306, 1985.

[20] T. Feder and D. H. Greene. Optimal algorithms for approximate clustering. In STOC, 1988.

[21] C. Faloutsos, B. Seeger, A. Traina, and C. Traina. Spatial join selectivity using power laws. In SIGMOD Conference, 2000.

[22] C. E. Rasmussen and C. K. I. Williams. Gaussian Processes for Machine Learning. The MIT Press, 2006.

[23] V. Simoncini and D. Szyld. Theory of inexact Krylov subspace methods and applications to scientific computing. Technical Report 02-4-12, Temple University, 2002.