
16 nips-2012-A Polynomial-time Form of Robust Regression


Source: pdf

Author: Ozlem Aslan, Dale Schuurmans, Yao-liang Yu

Abstract: Despite the variety of robust regression methods that have been developed, current regression formulations are either NP-hard, or allow unbounded response to even a single leverage point. We present a general formulation for robust regression—Variational M-estimation—that unifies a number of robust regression methods while allowing a tractable approximation strategy. We develop an estimator that requires only polynomial time, while achieving certain robustness and consistency guarantees. An experimental evaluation demonstrates the effectiveness of the new estimation approach compared to standard methods.
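To illustrate the bounded-influence behavior the abstract contrasts with ordinary least squares, the sketch below fits a line with the classical Huber M-estimator via iteratively reweighted least squares (IRLS). This is a generic textbook method, not the paper's Variational M-estimation algorithm; the function name `huber_irls` and all parameter choices are illustrative assumptions.

```python
import numpy as np

def huber_irls(X, y, delta=1.0, iters=50):
    """Huber M-estimation via IRLS (a standard textbook sketch,
    not the paper's Variational M-estimation method)."""
    n, d = X.shape
    w = np.ones(n)
    beta = np.zeros(d)
    for _ in range(iters):
        # Weighted least-squares step.
        W = np.diag(w)
        beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
        r = y - X @ beta
        # Huber weights: 1 inside the threshold, delta/|r| outside,
        # so any single residual's influence on the fit is bounded.
        absr = np.maximum(np.abs(r), 1e-12)
        w = np.where(absr <= delta, 1.0, delta / absr)
    return beta

# Data on a true line y = 2x with one gross outlier at the end.
x = np.linspace(0, 1, 20)
X = np.column_stack([np.ones_like(x), x])
y = 2.0 * x
y[-1] = 50.0  # a single leverage point

ols = np.linalg.lstsq(X, y, rcond=None)[0]   # slope dragged far from 2
rob = huber_irls(X, y, delta=0.5)            # slope stays near 2
```

The single corrupted observation moves the least-squares slope by an unbounded amount, while the downweighting in the Huber fit keeps the estimate close to the true line; the NP-hardness results cited in [1] concern robust estimators with stronger breakdown guarantees than this convex loss provides.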


reference text

[1] T. Bernholt. Robust estimators are hard to compute. Technical Report 52/2005, SFB475, U. Dortmund, 2005.

[2] R. Nunkesser and O. Morell. An evolutionary algorithm for robust regression. Computational Statistics and Data Analysis, 54:3242–3248, 2010.

[3] R. Maronna, R. Martin, and V. Yohai. Robust Statistics: Theory and Methods. Wiley, 2006.

[4] P. Huber and E. Ronchetti. Robust Statistics. Wiley, 2nd edition, 2009.

[5] A. Christmann and I. Steinwart. Consistency and robustness of kernel-based regression in convex risk minimization. Bernoulli, 13(3):799–819, 2007.

[6] A. Christmann, A. Van Messem, and I. Steinwart. On consistency and robustness properties of support vector machines for heavy-tailed distributions. Statistics and Its Interface, 2:311–327, 2009.

[7] P. Rousseeuw and A. Leroy. Robust Regression and Outlier Detection. Wiley, 1987.

[8] M. Black and A. Rangarajan. On the unification of line processes, outlier rejection, and robust statistics with applications in early vision. International Journal of Computer Vision, 19(1): 57–91, 1996.

[9] Y. Yu, M. Yang, L. Xu, M. White, and D. Schuurmans. Relaxed clipping: A global training method for robust regression and classification. In Advances in Neural Information Processing Systems (NIPS), 2010.

[10] A. Ben-Tal, L. El Ghaoui, and A. Nemirovski. Robust Optimization. Princeton Series in Applied Mathematics. Princeton University Press, October 2009.

[11] H. Xu, C. Caramanis, and S. Mannor. Robust regression and Lasso. In Advances in Neural Information Processing Systems (NIPS), volume 21, pages 1801–1808, 2008.

[12] H. Xu, C. Caramanis, and S. Mannor. Robustness and regularization of support vector machines. Journal of Machine Learning Research, 10:1485–1510, 2009.

[13] S. Mukherjee, P. Niyogi, T. Poggio, and R. Rifkin. Learning theory: Stability is sufficient for generalization and necessary and sufficient for consistency of empirical risk minimization. Advances in Computational Mathematics, 25(1-3):161–193, 2006.

[14] G. Kimeldorf and G. Wahba. A correspondence between Bayesian estimation on stochastic processes and smoothing by splines. Annals of Mathematical Statistics, 41(2):495–502, 1970.

[15] I. Steinwart and A. Christmann. Support Vector Machines. Springer, 2008.

[16] A. van der Vaart and J. Wellner. Weak Convergence and Empirical Processes. Springer, 1996.

[17] F. Hampel, E. Ronchetti, P. Rousseeuw, and W. Stahel. Robust Statistics: The Approach Based on Influence Functions. Wiley, 1986.

[18] D. Donoho and P. Huber. The notion of breakdown point. In A Festschrift for Erich L. Lehmann, pages 157–184. Wadsworth, 1983.

[19] P. Davies and U. Gather. The breakdown point—examples and counterexamples. REVSTAT Statistical Journal, 5(1):1–17, 2007.

[20] Y. Nesterov and A. Nemirovskii. Interior-Point Polynomial Algorithms in Convex Programming. SIAM, 1994.

[21] Y. Nesterov. Introductory Lectures on Convex Optimization: A Basic Course. Kluwer, 2003.

[22] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge U. Press, 2004.

[23] J. Peng and Y. Wei. Approximating k-means-type clustering via semidefinite programming. SIAM Journal on Optimization, 18(1):186–205, 2007.

[24] P. Rousseeuw and K. Van Driessen. Computing LTS regression for large data sets. Data Mining and Knowledge Discovery, 12(1):29–45, 2006.

[25] R. Horn and C. Johnson. Matrix Analysis. Cambridge University Press, 1985.