jmlr jmlr2013 jmlr2013-6 jmlr2013-6-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Xin Tong
Abstract: The Neyman-Pearson (NP) paradigm in binary classification treats type I and type II errors with different priorities. It seeks classifiers that minimize the type II error subject to a type I error constraint at a user-specified level α. In this paper, plug-in classifiers are developed under the NP paradigm. Based on the fundamental Neyman-Pearson Lemma, we propose two related plug-in classifiers that threshold, respectively, the class-conditional density ratio and the regression function; the two classifiers handle different sampling schemes. This work focuses on theoretical properties of the proposed classifiers; in particular, we derive oracle inequalities that can be viewed as finite-sample versions of risk bounds. NP classification can be used to address anomaly detection problems, where asymmetry in errors is an intrinsic property. In contrast to the common practice in anomaly detection of thresholding the normal-class density, our approach does not assume a specific form for the anomaly distribution. This consideration is particularly important when the anomaly class density is far from uniform.
Keywords: plug-in approach, Neyman-Pearson paradigm, nonparametric statistics, oracle inequality, anomaly detection
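The plug-in construction described in the abstract can be illustrated concretely. The Python sketch below is a minimal illustration, not the paper's exact estimator or its theoretical tuning: it assumes Gaussian kernel density estimates for the two class-conditional densities, one-dimensional toy data, and a held-out class-0 calibration sample used to set the threshold on the density ratio at its empirical (1 − α) quantile. All names (density_ratio, classify, alpha, etc.) are illustrative.

# Minimal sketch of a plug-in Neyman-Pearson classifier (assumed setup, not the
# paper's exact procedure): estimate the two class-conditional densities, then
# choose the threshold on the density ratio so that the empirical type I error
# on held-out class-0 calibration data is approximately at most alpha.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)

# Class 0 = "normal", class 1 = "anomaly"; toy 1-d data.
x0_train, x0_cal = rng.normal(0.0, 1.0, 500), rng.normal(0.0, 1.0, 500)
x1_train = rng.normal(2.0, 1.0, 500)

f0 = gaussian_kde(x0_train)   # kernel estimate of the class-0 density
f1 = gaussian_kde(x1_train)   # kernel estimate of the class-1 density

def density_ratio(x):
    # Estimated class-conditional density ratio f1/f0, guarded against division by ~0.
    return f1(x) / np.maximum(f0(x), 1e-12)

alpha = 0.05
# Threshold = empirical (1 - alpha) quantile of the ratio on class-0 calibration
# data, so the estimated probability of flagging a class-0 point stays near alpha.
threshold = np.quantile(density_ratio(x0_cal), 1.0 - alpha)

def classify(x):
    # Return 1 (anomaly) when the estimated density ratio exceeds the threshold.
    return (density_ratio(np.atleast_1d(x)) > threshold).astype(int)

x_test = rng.normal(0.0, 1.0, 1000)          # fresh class-0 data
print("empirical type I error:", classify(x_test).mean())

The sketch only shows the mechanics of plug-in thresholding; the oracle inequalities in the paper quantify how estimation error in the densities and in the threshold propagates to the excess type I and type II errors.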
M. Agyemang, K. Barker, and R. Alhajj. A comprehensive survey of numeric and symbolic outlier mining techniques. Intelligent Data Analysis, 6:521–538, 2006.
J. Audibert and A. Tsybakov. Fast learning rates for plug-in classifiers under the margin condition. Annals of Statistics, 35:608–633, 2007.
G. Blanchard, G. Lee, and G. Scott. Semi-supervised novelty detection. Journal of Machine Learning Research, 11:2973–3009, 2010.
A. Cannon, J. Howse, D. Hush, and C. Scovel. Learning with the Neyman-Pearson and min-max criteria. Technical Report LA-UR-02-2951, 2002.
D. Casasent and X. Chen. Radial basis function neural networks for nonlinear Fisher discrimination and Neyman-Pearson classification. Neural Networks, 16(5-6):529–535, 2003.
V. Chandola, A. Banerjee, and V. Kumar. Anomaly detection: a survey. ACM Computing Surveys, 09:1–72, 2009.
C. Elkan. The foundations of cost-sensitive learning. In Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence, pages 973–978, 2001.
E. Giné, V. Koltchinskii, and L. Sakhanenko. Kernel density estimators: convergence in distribution for weighted sup norms. Probability Theory and Related Fields, 130:167–198, 2004.
M. Han, D. Chen, and Z. Sun. Analysis to Neyman-Pearson classification with convex loss function. Analysis in Theory and Applications, 24(1):18–28, 2008.
V. Hodge and J. Austin. A survey of outlier detection methodologies. Artificial Intelligence Review, 2:85–126, 2004.
E. L. Lehmann and J. P. Romano. Testing Statistical Hypotheses. Springer Texts in Statistics. Springer, New York, third edition, 2005. ISBN 0-387-98864-5.
J. Lei, A. Rinaldo, and L. Wasserman. A conformal prediction approach to explore functional data. Annals of Mathematics and Artificial Intelligence, 2013.
O. Lepski. Multivariate density estimation under sup-norm loss: oracle approach, adaptation and independence structure. Annals of Statistics, 41(2):1005–1034, 2013.
E. Mammen and A. B. Tsybakov. Smooth discrimination analysis. Annals of Statistics, 27:1808–1829, 1999.
M. Markou and S. Singh. Novelty detection: a review, part 1: statistical approaches. Signal Processing, 12:2481–2497, 2003a.
M. Markou and S. Singh. Novelty detection: a review, part 2: network-based approaches. Signal Processing, 12:2499–2521, 2003b.
A. Patcha and J. M. Park. An overview of anomaly detection techniques: existing solutions and latest technological trends. Computer Networks, 12:3448–3470, 2007.
W. Polonik. Measuring mass concentrations and estimating density contour clusters: an excess mass approach. Annals of Statistics, 23:855–881, 1995.
P. Rigollet and X. Tong. Neyman-Pearson classification, convexity and stochastic constraints. Journal of Machine Learning Research, 12:2831–2855, 2011.
P. Rigollet and R. Vert. Optimal rates for plug-in estimators of density level sets. Bernoulli, 15(4):1154–1178, 2009.
C. Scott. Comparison and design of Neyman-Pearson classifiers. Unpublished, 2005.
C. Scott. Performance measures for Neyman-Pearson classification. IEEE Transactions on Information Theory, 53(8):2852–2863, 2007.
C. Scott and R. Nowak. A Neyman-Pearson approach to statistical learning. IEEE Transactions on Information Theory, 51(11):3806–3819, 2005.
B. Tarigan and S. van de Geer. Classifiers of support vector machine type with l1 complexity regularization. Bernoulli, 12:1045–1076, 2006.
A. Tsybakov. Optimal aggregation of classifiers in statistical learning. Annals of Statistics, 32:135–166, 2004.
A. Tsybakov. Introduction to Nonparametric Estimation. Springer, 2009.
A. Tsybakov and S. van de Geer. Square root penalty: adaptation to the margin in classification and in edge estimation. Annals of Statistics, 33:1203–1224, 2005.
D. Wied and R. Weißbach. Consistency of the kernel density estimator: a survey. Statistical Papers, 53(1):1–21, 2010.
Y. Yang. Minimax nonparametric classification, part I: rates of convergence. IEEE Transactions on Information Theory, 45:2271–2284, 1999.
B. Zadrozny, J. Langford, and N. Abe. Cost-sensitive learning by cost-proportionate example weighting. In Proceedings of the IEEE International Conference on Data Mining, page 435, 2003.