nips nips2010 nips2010-205 nips2010-205-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Malik Magdon-Ismail
Abstract: We define a data-dependent permutation complexity for a hypothesis set H, which is similar to a Rademacher complexity or maximum discrepancy. The permutation complexity is based (like the maximum discrepancy) on dependent sampling. We prove a uniform bound on the generalization error, as well as a concentration result which means that the permutation complexity can be efficiently estimated.
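The abstract does not state the formula for the permutation complexity, so the following is only a minimal sketch under an assumed definition chosen by analogy with the empirical Rademacher complexity: the random signs are replaced by a uniformly random permutation of the observed labels, giving E_π[max_{h∈H} (1/n) Σ_i y_{π(i)} h(x_i)]. Permuting the labels samples them without replacement, which is one way to read "dependent sampling". The function name `permutation_complexity`, the finite hypothesis set given as ±1 prediction vectors, and the Monte Carlo averaging are all illustrative assumptions, not the paper's stated method.

```python
import random
from typing import Optional, Sequence


def permutation_complexity(
    predictions: Sequence[Sequence[int]],
    labels: Sequence[int],
    num_permutations: int = 200,
    seed: Optional[int] = None,
) -> float:
    """Monte Carlo estimate of E_pi[ max_h (1/n) * sum_i y_pi(i) * h(x_i) ].

    ASSUMED definition (by analogy with Rademacher complexity); the
    source abstract does not give the formula.

    predictions: one vector per hypothesis h, holding h(x_1), ..., h(x_n).
    labels:      the observed labels y_1, ..., y_n (e.g. +1 / -1).
    """
    n = len(labels)
    if n == 0 or not predictions:
        raise ValueError("need at least one data point and one hypothesis")
    if any(len(p) != n for p in predictions):
        raise ValueError("each prediction vector must have one entry per label")

    rng = random.Random(seed)
    permuted = list(labels)
    total = 0.0
    for _ in range(num_permutations):
        # Draw a uniformly random permutation of the labels.
        rng.shuffle(permuted)
        # Best correlation any hypothesis achieves with the permuted labels.
        best = max(
            sum(y * h for y, h in zip(permuted, preds)) / n
            for preds in predictions
        )
        total += best
    return total / num_permutations


if __name__ == "__main__":
    # Toy hypothesis set: two opposite "alternating" classifiers on 4 points.
    toy_predictions = [[1, -1, 1, -1], [-1, 1, -1, 1]]
    toy_labels = [1, 1, -1, -1]
    print(permutation_complexity(toy_predictions, toy_labels, seed=0))
```

Averaging over a modest number of random permutations, rather than enumerating all n! of them, is only sensible if the quantity concentrates around its mean, which is the kind of concentration result the abstract announces.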
References:
Akaike, H. (1974). A new look at the statistical model identification. IEEE Trans. Aut. Cont., 19, 716–723.
Asuncion, A. and Newman, D. (2007). UCI machine learning repository.
Bartlett, P. L. and Mendelson, S. (2002). Rademacher and Gaussian complexities: Risk bounds and structural results. Journal of Machine Learning Research, 3, 463–482.
Bartlett, P. L., Boucheron, S., and Lugosi, G. (2002). Model selection and error estimation. Machine Learning, 48, 85–113.
Craven, P. and Wahba, G. (1979). Smoothing noisy data with spline functions. Numerische Mathematik, 31, 377–403.
Cureton, E. E. (1951). Symposium: The need and means of cross-validation: II approximate linear restraints and best predictor weights. Education and Psychology Measurement, 11, 12–15.
Efron, B. (2004). The estimation of prediction error: Covariance penalties and cross-validation. Journal of the American Statistical Association, 99(467), 619–632.
Fromont, M. (2007). Model selection by bootstrap penalization for classification. Machine Learning, 66(2-3), 165–207.
Giné, E. and Zinn, J. (1984). Some limit theorems for empirical processes. Annals of Prob., 12, 929–989.
Golland, P., Liang, F., Mukherjee, S., and Panchenko, D. (2005). Permutation tests for classification. Learning Theory, pages 501–515.
Good, P. (2005). Permutation, parametric, and bootstrap tests of hypotheses. Springer.
Kääriäinen, M. and Elomaa, T. (2003). Rademacher penalization over decision tree prunings. In Proc. 14th European Conference on Machine Learning, pages 193–204.
Katzell, R. A. (1951). Symposium: The need and means of cross-validation: III cross validation of item analyses. Education and Psychology Measurement, 11, 16–22.
Koltchinskii, V. (2001). Rademacher penalties and structural risk minimization. IEEE Transactions on Information Theory, 47(5), 1902–1914.
Koltchinskii, V. and Panchenko, D. (2000). Rademacher processes and bounding the risk of function learning. In E. Giné, D. Mason, and J. Wellner, editors, High Dimensional Prob. II, volume 47, pages 443–459.
Larson, S. C. (1931). The shrinkage of the coefficient of multiple correlation. Journal of Education Psychology, 22, 45–55.
Lozano, F. (2000). Model selection using Rademacher penalization. In Proc. 2nd ICSC Symp. on Neural Comp.
Lugosi, G. and Nobel, A. (1999). Adaptive model selection using empirical complexities. Annals of Statistics, 27, 1830–1864.
Magdon-Ismail, M. and Mertsalov, K. (2010). A permutation approach to validation. In Proc. 10th SIAM International Conference on Data Mining (SDM).
Massart, P. (2000). Some applications of concentration inequalities to statistics. Annales de la Faculté des Sciences de Toulouse, X, 245–303.
McDiarmid, C. (1989). On the method of bounded differences. In Surveys in Combinatorics, pages 148–188. Cambridge University Press.
Mosier, C. I. (1951). Symposium: The need and means of cross-validation: I problem and designs of cross validation. Education and Psychology Measurement, 11, 5–11.
Shawe-Taylor, J. and Cristianini, N. (2004). Kernel Methods for Pattern Analysis. Camb. Univ. Press.
Shawe-Taylor, J., Bartlett, P. L., Williamson, R. C., and Anthony, M. (1998). Structural risk minimization over data dependent hierarchies. IEEE Transactions on Information Theory, 44, 1926–1940.
Stone, M. (1974). Cross validatory choice and assessment of statistical predictions. Journal of the Royal Statistical Society, 36(2), 111–147.
Vapnik, V. N. and Chervonenkis, A. (1971). On the uniform convergence of relative frequencies of events to their probabilities. Theory of Probability and its Applications, 16, 264–280.
Vapnik, V. N., Levin, E., and Le Cun, Y. (1994). Measuring the VC-dimension of a learning machine. Neural Computation, 6(5), 851–876.
Wang, J. and Shen, X. (2006). Estimation of generalization error: random and fixed inputs. Statistica Sinica, 16, 569–588.
Wherry, R. J. (1931). A new formula for predicting the shrinkage of the multiple correlation coefficient. Annals of Mathematical Statistics, 2, 440–457.
Wherry, R. J. (1951). Symposium: The need and means of cross-validation: III comparison of cross validation with statistical inference of betas and multiple r from a single sample. Education and Psychology Measurement, 11, 23–28.
Wiklund, S., Nilsson, D., Eriksson, L., Sjostrom, M., Wold, S., and Faber, K. (2007). A randomization test for PLS component selection. Journal of Chemometrics, 21(10-11), 427–439.