jmlr2013-96-reference: knowledge-graph entry (maker-knowledge-mining, JMLR 2013)
Source: PDF
Authors: Samuel Gerber, Ross Whitaker
Abstract: Principal curves and manifolds provide a framework for formulating manifold learning within a statistical context. Principal curves define the notion of a curve passing through the middle of a distribution. While the intuition is clear, the formal definition leads to some technical and practical difficulties. In particular, principal curves are saddle points of the mean-squared projection distance, which poses severe challenges for estimation and model selection. This paper demonstrates that the difficulties in model selection associated with the saddle-point property of principal curves are intrinsically tied to the minimization of the mean-squared projection distance. We introduce a new objective function, facilitated through a modification of the principal curve estimation approach, for which all critical points are principal curves and minima. Thus, the new formulation removes the fundamental obstacle to model selection in principal curve estimation. A gradient-descent-based estimator demonstrates the effectiveness of the new formulation for controlling model complexity in numerical experiments with synthetic and real data.

Keywords: principal curve, manifold estimation, unsupervised learning, model complexity, model selection
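To make the abstract's setting concrete, the following is a minimal sketch of gradient-descent principal-curve estimation. It is not the authors' estimator (which modifies the conditional-expectation formulation); it is a generic polygonal-line approach that descends the mean-squared projection (quantization) distance plus a smoothness penalty. The function name and all parameters (`n_vertices`, `lam`, `lr`, `n_iter`) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def fit_principal_curve(X, n_vertices=15, lam=0.02, lr=0.3, n_iter=300):
    """Fit a polyline 'principal curve' to data X of shape (n, d) by gradient
    descent on the mean squared distance of each point to its nearest vertex,
    plus a first-difference smoothness penalty on the vertices.
    Illustrative sketch only; parameter names are hypothetical."""
    n = len(X)
    # Initialize vertices evenly spaced along the first principal component.
    Xc = X - X.mean(0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    t = Xc @ Vt[0]
    V = X.mean(0) + np.linspace(t.min(), t.max(), n_vertices)[:, None] * Vt[0]
    for _ in range(n_iter):
        # Hard assignment: each data point projects to its nearest vertex.
        d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(-1)
        a = d2.argmin(1)
        # Gradient of the mean squared projection distance (with the
        # assignment held fixed): each vertex is pulled toward the mean
        # of its assigned points, weighted by the cluster's share of data.
        grad = np.zeros_like(V)
        for k in range(n_vertices):
            pts = X[a == k]
            if len(pts):
                grad[k] = 2.0 * len(pts) / n * (V[k] - pts.mean(0))
        # Gradient of lam * sum_k ||V_{k+1} - V_k||^2 (keeps the polyline smooth).
        grad[:-1] += 2.0 * lam * (V[:-1] - V[1:])
        grad[1:] += 2.0 * lam * (V[1:] - V[:-1])
        V -= lr * grad
    return V
```

Without the smoothness term this reduces to a gradient form of k-means, whose minimizers need not trace a curve; the penalty is what ties the vertices into an ordered polyline. The saddle-point issue the abstract describes shows up here as sensitivity to `n_vertices` and `lam`: minimizing projection distance alone always rewards more complex curves, which is the model-selection problem the paper's new objective is designed to remove.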