Source: pdf
Author: Hariharan Narayanan, Sanjoy Mitter
Abstract: The hypothesis that high dimensional data tends to lie in the vicinity of a low dimensional manifold is the basis of a collection of methodologies termed Manifold Learning. In this paper, we study statistical aspects of the question of fitting a manifold with a nearly optimal least squared error. Given upper bounds on the dimension, volume, and curvature, we show that Empirical Risk Minimization can produce a nearly optimal manifold using a number of random samples that is independent of the ambient dimension of the space in which data lie. We obtain an upper bound on the required number of samples that depends polynomially on the curvature, exponentially on the intrinsic dimension, and linearly on the intrinsic volume. For constant error, we prove a matching minimax lower bound on the sample complexity that shows that this dependence on intrinsic dimension, volume, and curvature is unavoidable. Whether the known lower bound of $O\left(\frac{k}{\epsilon^2} + \frac{\log \frac{1}{\delta}}{\epsilon^2}\right)$ for the sample complexity of Empirical Risk Minimization on $k$-means applied to data in a unit ball of arbitrary dimension is tight has been an open question since 1997 [3]. Here $\epsilon$ is the desired bound on the error and $\delta$ is a bound on the probability of failure. We improve the best currently known upper bound [14] of $O\left(\frac{k^2}{\epsilon^2} + \frac{\log \frac{1}{\delta}}{\epsilon^2}\right)$ to $O\left(\frac{k}{\epsilon^2}\min\left(k, \frac{\log^4 \frac{k}{\epsilon}}{\epsilon^2}\right) + \frac{\log \frac{1}{\delta}}{\epsilon^2}\right)$. Based on these results, we devise a simple algorithm for $k$-means and another that uses a family of convex programs to fit a piecewise linear curve of a specified length to high dimensional data, where the sample complexity is independent of the ambient dimension.
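To make the sample-based $k$-means idea concrete, the sketch below draws an i.i.d. subsample whose size follows the shape of the stated upper bound (with constant factors and thus absolute sizes chosen arbitrarily) and runs a standard solver on the subsample only, so that the work of fitting the centers does not grow with the ambient dimension of the data. This is a minimal illustration, not the paper's construction: the function names are hypothetical, the constants are placeholders, and scikit-learn's KMeans (a Lloyd-type heuristic) stands in for exact Empirical Risk Minimization.

```python
import numpy as np
from sklearn.cluster import KMeans

def erm_sample_size(k, eps, delta):
    """Illustrative sample size following the shape of the bound
    O((k/eps^2) * min(k, log^4(k/eps)/eps^2) + log(1/delta)/eps^2).
    The implicit constant is set to 1 here; that choice is an assumption,
    not a value from the paper."""
    term = (k / eps**2) * min(k, np.log(k / eps) ** 4 / eps**2)
    return int(np.ceil(term + np.log(1.0 / delta) / eps**2))

def kmeans_on_subsample(data, k, eps, delta, seed=None):
    """Fit k centers on an i.i.d. subsample of `data` (rows are points in
    the unit ball). The subsample size depends on k, eps, delta but not on
    the ambient dimension; KMeans is used as a surrogate empirical risk
    minimizer."""
    rng = np.random.default_rng(seed)
    n = min(erm_sample_size(k, eps, delta), len(data))
    sample = data[rng.choice(len(data), size=n, replace=False)]
    km = KMeans(n_clusters=k, n_init=10).fit(sample)
    return km.cluster_centers_

if __name__ == "__main__":
    # Example: 2000 points on the unit sphere in 10,000 dimensions, k = 3.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(2000, 10_000))
    X /= np.linalg.norm(X, axis=1, keepdims=True)
    centers = kmeans_on_subsample(X, k=3, eps=0.3, delta=0.05, seed=0)
    print(centers.shape)  # (3, 10000)
```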
[1] Noga Alon, Shai Ben-David, Nicolò Cesa-Bianchi, and David Haussler. Scale-sensitive dimensions, uniform convergence, and learnability. J. ACM, 44(4):615–631, 1997.
[2] Rosa I. Arriaga and Santosh Vempala. An algorithmic theory of learning: Robust concepts and random projection. In FOCS, pages 616–623, 1999.
[3] Peter Bartlett. The minimax distortion redundancy in empirical quantizer design. IEEE Transactions on Information Theory, 44:1802–1813, 1997.
[4] Mikhail Belkin and Partha Niyogi. Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput., 15(6):1373–1396, 2003.
[5] Shai Ben-David. A framework for statistical clustering with constant time approximation algorithms for k-median and k-means clustering. Mach. Learn., 66(2-3):243–257, 2007.
[6] Gunnar Carlsson. Topology and data. Bulletin of the American Mathematical Society, 46:255– 308, January 2009.
[7] Sanjoy Dasgupta. Learning mixtures of Gaussians. In FOCS, pages 634–644, 1999.
[8] David L. Donoho and Carrie Grimes. Hessian eigenmaps: Locally linear embedding techniques for high-dimensional data. Proceedings of the National Academy of Sciences, 100(10):5591–5596, May 2003.
[9] A. Gray. Tubes. Addison-Wesley, 1990.
[10] Trevor J. Hastie and Werner Stuetzle. Principal curves. Journal of the American Statistical Association, 84:502–516, 1989.
[11] William Johnson and Joram Lindenstrauss. Extensions of Lipschitz mappings into a Hilbert space. Contemporary Mathematics, 26:419–441, 1984.
[12] Balázs Kégl, Adam Krzyzak, Tamás Linder, and Kenneth Zeger. Learning and design of principal curves. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22:281–297, 2000.
[13] Amit Kumar, Yogish Sabharwal, and Sandeep Sen. A simple linear time (1+ε)-approximation algorithm for k-means clustering in any dimensions. In FOCS, pages 454–462, 2004.
[14] Andreas Maurer and Massimiliano Pontil. Generalization bounds for k-dimensional coding schemes in Hilbert spaces. In ALT, pages 79–91, 2008.
[15] H. Narayanan and P. Niyogi. On the sample complexity of learning smooth cuts on a manifold. In Proc. of the 22nd Annual Conference on Learning Theory (COLT), June 2009.
[16] Partha Niyogi, Stephen Smale, and Shmuel Weinberger. Finding the homology of submanifolds with high confidence from random samples. Discrete & Computational Geometry, 39(1-3):419–441, 2008.
[17] Sam T. Roweis and Lawrence K. Saul. Nonlinear dimensionality reduction by locally linear embedding. Science, 290:2323–2326, 2000.
[18] Alexander J. Smola, Sebastian Mika, Bernhard Schölkopf, and Robert C. Williamson. Regularized principal manifolds. J. Mach. Learn. Res., 1:179–209, 2001.
[19] J. B. Tenenbaum, V. de Silva, and J. C. Langford. A global geometric framework for nonlinear dimensionality reduction. Science, 290(5500):2319–2323, 2000.
[20] Afra Zomorodian and Gunnar Carlsson. Computing persistent homology. Discrete & Computational Geometry, 33(2):249–274, 2005.