jmlr jmlr2013 jmlr2013-97 jmlr2013-97-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Chao Zhang, Dacheng Tao
Abstract: Lévy processes refer to a class of stochastic processes, for example, Poisson processes and Brownian motions, and play an important role in stochastic processes and machine learning. Therefore, it is essential to study risk bounds of the learning process for time-dependent samples drawn from a Lévy process (or briefly called learning process for Lévy process). It is noteworthy that samples in this learning process are not independently and identically distributed (i.i.d.). Therefore, results in traditional statistical learning theory are not applicable (or at least cannot be applied directly), because they are obtained under the sample-i.i.d. assumption. In this paper, we study risk bounds of the learning process for time-dependent samples drawn from a Lévy process, and then analyze the asymptotical behavior of the learning process. In particular, we first develop the deviation inequalities and the symmetrization inequality for the learning process. By using the resultant inequalities, we then obtain the risk bounds based on the covering number. Finally, based on the resulting risk bounds, we study the asymptotic convergence and the rate of convergence of the learning process for Lévy process. Meanwhile, we also give a comparison to the related results under the sample-i.i.d. assumption.
Keywords: Lévy process, risk bound, deviation inequality, symmetrization inequality, statistical learning theory, time-dependent
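The abstract's point that samples from a Lévy process are not i.i.d. can be illustrated with a small simulation. The following is a minimal sketch, not the paper's method: it assumes the process is observed on an equally spaced time grid, and the function names and the parameters `dt`, `rate`, `sigma`, and `seed` are hypothetical choices made for illustration. It draws samples from a Brownian motion and a Poisson process (the two examples named in the abstract) by accumulating independent, stationary increments, so the samples themselves are dependent while their increments are i.i.d.

```python
import random


def poisson_increment(rng, mean):
    """Draw a Poisson(mean) count by counting unit-rate arrivals in [0, mean)."""
    count = 0
    t = rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def simulate_levy_samples(n, dt=1.0, rate=2.0, sigma=1.0, seed=0):
    """Sample a Brownian motion and a Poisson process at times dt, 2*dt, ..., n*dt.

    Both paths start at 0 and are built by summing independent increments,
    so consecutive samples are dependent (assumed illustrative setup).
    """
    rng = random.Random(seed)
    brownian, poisson = [], []
    b, p = 0.0, 0
    for _ in range(n):
        b += rng.gauss(0.0, sigma * dt ** 0.5)   # Gaussian increment, variance sigma^2 * dt
        p += poisson_increment(rng, rate * dt)   # Poisson increment, mean rate * dt
        brownian.append(b)
        poisson.append(p)
    return brownian, poisson


def increments(path):
    """Recover the increments of a path that starts from 0."""
    if not path:
        return []
    return [path[0]] + [path[i] - path[i - 1] for i in range(1, len(path))]


if __name__ == "__main__":
    brownian, poisson = simulate_levy_samples(10)
    print("Poisson samples:   ", poisson)
    print("Poisson increments:", increments(poisson))
```

The Poisson samples can only grow over time, which makes their dependence visible at a glance, whereas the recovered increments behave like independent draws from a single distribution.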
D. Applebaum. Lévy Processes and Stochastic Calculus. Cambridge: Cambridge Press, 2004a.
D. Applebaum. Lévy processes-from probability to finance and quantum groups. Notices of the American Mathematical Society, 51:1336–1347, 2004b.
O.E. Barndorff-Nielsen, T. Mikosch, and S.I. Resnick. Lévy Processes: Theory and Applications. Birkhauser, 2001.
P.L. Bartlett, O. Bousquet, and S. Mendelson. Local Rademacher complexities. Annals of Statistics, 33(4):1497–1537, 2005.
G. Bennett. Probability inequalities for the sum of independent random variables. Journal of the American Statistical Association, 57(297):33–45, 1962.
M. Biguesh and A.B. Gershman. Training-based MIMO channel estimation: a study of estimator tradeoffs and optimal training signals. IEEE Transactions on Signal Processing, 54(3):884–893, 2006.
A. Bose, A. Dasgupta, and H. Rubin. A contemporary review and bibliography of infinitely divisible distributions and processes. The Indian Journal of Statistics, Series A, 64:763–819, 2002.
O. Bousquet. A Bennett concentration inequality and its application to suprema of empirical processes. Comptes Rendus Mathematique, 334(6):495–500, 2002.
O. Bousquet, S. Boucheron, and G. Lugosi. Introduction to statistical learning theory. Advanced Lectures on Machine Learning, pages 169–207, 2004.
N. Cesa-Bianchi and C. Gentile. Improved risk tail bounds for on-line algorithms. IEEE Transactions on Information Theory, 54(1):386–390, 2008.
R. Cont and P. Tankov. Retrieving Lévy processes from option prices: Regularization of an ill-posed inverse problem. SIAM Journal on Control and Optimization, 45(1):1–25, 2006.
T.E. Duncan. Mutual information for stochastic signals and Lévy processes. IEEE Transactions on Information Theory, 56(1):18–24, 2009.
J.E. Figueroa-López and C. Houdré. Risk bounds for the non-parametric estimation of Lévy processes. Lecture Notes-Monograph Series, pages 96–116, 2006.
C. Houdré. Remarks on deviation inequalities for functions of infinitely divisible random vectors. Annals of Probability, pages 1223–1237, 2002.
C. Houdré and P. Marchal. Median, concentration and fluctuations for Lévy processes. Stochastic Processes and their Applications, 118(5):852–863, 2008.
C. Houdré, V. Pérez-Abreu, and D. Surgailis. Interpolation, correlation identities, and inequalities for infinitely divisible variables. Journal of Fourier Analysis and Applications, 4(6):651–668, 1998.
M. Jacobsen. Point Process Theory and Applications: Marked Point and Piecewise Deterministic Processes. Birkhäuser Boston, 2005.
W. Jiang. On the uniform deviations of general empirical risks with unboundedness, dependence, and high dimensionality. Journal of Machine Learning Research, 10:977–996, 2009.
O. Kallenberg. Canonical representations and convergence criteria for processes with interchangeable increments. Probability Theory and Related Fields, 27(1):23–36, 1973.
O. Kallenberg et al. On symmetrically distributed random measures. Trans. Amer. Math. Soc, 202:105–121, 1975.
K. Kim. Financial time series forecasting using support vector machines. Neurocomputing, 55(1):307–319, 2003.
V. Koltchinskii. Rademacher penalties and structural risk minimization. IEEE Transactions on Information Theory, 47(5):1902–1914, 2001.
A. Kyprianou. Introductory Lectures on Fluctuations of Lévy Processes with Applications (Universitext). Springer, 2006.
D.J. Love, R.W. Heath, V.K.N. Lau, D. Gesbert, B.D. Rao, and M. Andrews. An overview of limited feedback in wireless communication systems. IEEE Journal on Selected Areas Communications, 26(8):1341–1365, 2008.
S. Mendelson. Rademacher averages and phase transitions in Glivenko-Cantelli classes. IEEE Transactions on Information Theory, 48(1):251–263, 2002.
S. Mendelson. A few notes on statistical learning theory. Advanced Lectures on Machine Learning, pages 1–40, 2003.
S. Mendelson. Lower bounds for the empirical minimization algorithm. IEEE Transactions on Information Theory, 54(8):3797–3803, 2008.
M. Mohri and A. Rostamizadeh. Stability bounds for stationary φ-mixing and β-mixing processes. Journal of Machine Learning Research, 11:798–814, 2010.
S. Mukherjee, E. Osuna, and F. Girosi. Nonlinear prediction of chaotic time series using support vector machines. In IEEE Workshop on Neural Networks for Signal Processing VII, pages 511–520, 1997.
A. Müller. Integral probability metrics and their generating classes of functions. Advances in Applied Probability, 29(2):429–443, 1997.
A.B. Nobel and A. Dembo. A note on uniform laws of averages for dependent processes. Statistics & Probability Letters, 17(3):169–172, 1993.
K.S. Pedersen, R. Duits, and M. Nielsen. On α kernels, Lévy processes, and natural image statistics. In Kimmel, Sochen, and Weickert, editors, Scale Space and PDE Methods in Computer Vision, pages 468–479, 2005.
S.T. Rachev. Probability Metrics and the Stability of Stochastic Models. New York: Wiley, 1991.
M. Reid and B. Williamson. Information, divergence and risk for binary experiments. Journal of Machine Learning Research, 12:731–817, 2011.
M. Sanchez-Fernandez, M. de Prado-Cumplido, J. Arenas-Garcia, and F. Perez-Cruz. SVM multiregression for nonlinear channel estimation in multiple-input multiple-output systems. IEEE Transactions on Signal Processing, 52(8):2298–2307, 2004.
K. Sato. Lévy Processes and Infinite Divisible Distributions (Cambridge Studies in Advanced Mathematics). USA: Cambridge University Press, 2004.
B.K. Sriperumbudur, K. Fukumizu, A. Gretton, B. Schölkopf, and G.R.G. Lanckriet. On the empirical estimation of integral probability metrics. Electronic Journal of Statistics, 6:1550–1599, 2012.
A. Sutivong, M. Chiang, T.M. Cover, and Y.H. Kim. Channel capacity and state estimation for state-dependent Gaussian channels. IEEE Transactions on Information Theory, 51(4):1486–1495, 2005.
A.M. Tulino, A. Lozano, and S. Verdú. Impact of antenna correlation on the capacity of multiantenna channels. IEEE Transactions on Information Theory, 51(7):2491–2509, 2005.
A. Van der Vaart and J. Wellner. Weak Convergence and Empirical Processes: With Applications to Statistics. Springer, 1996.
V.N. Vapnik. Statistical Learning Theory. Wiley, 1998.
B. Yu. Rates of convergence for empirical processes of stationary mixing sequences. Annals of Probability, 22(1):94–116, 1994.
C. Zhang and D. Tao. Risk bounds for Lévy processes in the PAC-learning framework. Journal of Machine Learning Research-Proceedings Track, 9:948–955, 2010.
C. Zhang and D. Tao. Generalization bound for infinitely divisible empirical process. J. Mach. Learn. Res.-Proc. Track, 15:864–872, 2011a.
C. Zhang and D. Tao. Risk bounds for infinitely divisible distribution. In The 27th Conference on Uncertainty in Artificial Intelligence (UAI 2011), 2011b.
V.M. Zolotarev. Probability metrics. Theory of Probability and its Application, 28(1):278–302, 1984.