
290 nips-2010-t-logistic regression


Source: pdf

Author: Nan Ding, S. V. N. Vishwanathan

Abstract: We extend logistic regression by using t-exponential families, which were recently introduced in statistical physics. This gives rise to a regularized risk minimization problem with a non-convex loss function. An efficient block coordinate descent optimization scheme can be derived for estimating the parameters. Because of the nature of the loss function, our algorithm is tolerant to label noise. Furthermore, unlike other algorithms which employ non-convex loss functions, our algorithm is fairly robust to the choice of initial values. We verify both of these observations empirically on a number of synthetic and real datasets.
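
To make the abstract's central objects concrete, the following is a minimal sketch of the t-exponential and t-logarithm in their standard deformed-exponential form, together with one plausible binary t-logistic loss. The parameterization p(y | x) = exp_t(y·u/2 − G), the restriction to 1 ≤ t < 2, the function names, and the use of bisection to find the normalizer G are all assumptions made for illustration; the abstract does not specify how the authors compute the normalizer.

```python
import math


def exp_t(x, t):
    """t-exponential [1 + (1 - t) x]_+ ** (1 / (1 - t)); reduces to exp(x) at t = 1."""
    if abs(t - 1.0) < 1e-12:
        return math.exp(x)
    base = 1.0 + (1.0 - t) * x
    if base <= 0.0:
        # Clipped to 0 for t < 1; diverges for t > 1.
        return 0.0 if t < 1.0 else math.inf
    return base ** (1.0 / (1.0 - t))


def log_t(x, t):
    """t-logarithm (x ** (1 - t) - 1) / (1 - t), the inverse of exp_t for x > 0."""
    if abs(t - 1.0) < 1e-12:
        return math.log(x)
    return (x ** (1.0 - t) - 1.0) / (1.0 - t)


def t_logistic_loss(margin, t, iters=200):
    """Hypothetical binary loss -log p(y | x), with p(y | x) = exp_t(y*u/2 - G).

    `margin` is y*u. G is chosen so the two class probabilities sum to 1;
    it is found here by bisection (an illustrative choice, not necessarily
    the authors' procedure). Assumes 1 <= t < 2.
    """
    if not 1.0 <= t < 2.0:
        raise ValueError("this sketch assumes 1 <= t < 2")
    if abs(t - 1.0) < 1e-12:
        return math.log1p(math.exp(-margin))  # ordinary logistic loss
    half = abs(margin) / 2.0
    # At G = half the probabilities sum to more than 1; at
    # G = half - log_t(0.5, t) they sum to at most 1, so the root is bracketed.
    lo, hi = half, half - log_t(0.5, t)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        total = exp_t(half - mid, t) + exp_t(-half - mid, t)
        if total > 1.0:
            lo = mid
        else:
            hi = mid
    g = 0.5 * (lo + hi)
    return -math.log(exp_t(margin / 2.0 - g, t))


if __name__ == "__main__":
    for m in (-10.0, -1.0, 0.0, 1.0, 10.0):
        print(m, t_logistic_loss(m, 1.0), t_logistic_loss(m, 1.5))
```

Under these assumptions, the loss for t > 1 grows much more slowly than the logistic loss on badly misclassified points (large negative margins), which is one way to read the abstract's claim of tolerance to label noise.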


reference text

[1] Choon Hui Teo, S. V. N. Vishwanathan, Alex J. Smola, and Quoc V. Le. Bundle methods for regularized risk minimization. J. Mach. Learn. Res., 11:311–365, January 2010.

[2] S. Ben-David, N. Eiron, and P.M. Long. On the difficulty of approximately maximizing agreements. J. Comput. System Sci., 66(3):496–514, 2003.

[3] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, Cambridge, England, 2004.

[4] Phil Long and Rocco Servedio. Random classification noise defeats all convex potential boosters. Machine Learning Journal, 78(3):287–304, 2010.

[5] Yoav Freund. A more robust boosting algorithm. Technical Report arXiv:0905.2138, arXiv, May 2009.

[6] J. Naudts. Deformed exponentials and logarithms in generalized thermostatistics. Physica A, 316:323–334, 2002. URL http://arxiv.org/pdf/cond-mat/0203489.

[7] J. Naudts. Generalized thermostatistics based on deformed exponential and logarithmic functions. Physica A, 340:32–40, 2004.

[8] J. Naudts. Generalized thermostatistics and mean-field theory. Physica A, 332:279–300, 2004.

[9] J. Naudts. Estimators, escort probabilities, and φ-exponential families in statistical physics. Journal of Inequalities in Pure and Applied Mathematics, 5(4), 2004.

[10] C. Tsallis. Possible generalization of Boltzmann-Gibbs statistics. J. Stat. Phys., 52, 1988.

[11] Christopher Bishop. Pattern Recognition and Machine Learning. Springer, 2006.

[12] Trevor Hastie, Robert Tibshirani, and Jerome Friedman. The Elements of Statistical Learning. Springer, New York, 2nd edition, 2009.

[13] Timothy D. Sears. Generalized Maximum Entropy, Convexity, and Machine Learning. PhD thesis, Australian National University, 2008.

[14] Andre Sousa and Constantino Tsallis. Student’s t- and r-distributions: Unified derivation from an entropic variational principle. Physica A, 236:52–57, 1994.

[15] A. O'Hagan. On outlier rejection phenomena in Bayes inference. Journal of the Royal Statistical Society, Series B, 41(3):358–367, 1979.

[16] Kenneth L. Lange, Roderick J. A. Little, and Jeremy M. G. Taylor. Robust statistical modeling using the t distribution. Journal of the American Statistical Association, 84(408):881–896, 1989.

[17] J. Vanhatalo, P. Jylanki, and A. Vehtari. Gaussian process regression with student-t likelihood. In Neural Information Processing Systems, 2009.

[18] Takahito Kuno, Yasutoshi Yajima, and Hiroshi Konno. An outer approximation method for minimizing the product of several convex functions on a convex set. Journal of Global Optimization, 3(3):325–335, September 1993.

[19] David Mease and Abraham Wyner. Evidence contrary to the statistical view of boosting. J. Mach. Learn. Res., 9:131–156, February 2008.

[20] R. Collobert, F.H. Sinz, J. Weston, and L. Bottou. Trading convexity for scalability. In W.W. Cohen and A. Moore, editors, Machine Learning, Proceedings of the Twenty-Third International Conference (ICML 2006), pages 201–208. ACM, 2006.

[21] J. Nocedal and S. J. Wright. Numerical Optimization. Springer Series in Operations Research. Springer, 1999.

[22] C.C. Chang and C.J. Lin. LIBSVM: a library for support vector machines, 2001. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.

[23] Fabian Sinz. UniverSVM: Support Vector Machine with Large Scale CCCP Functionality, 2006. Software available at http://www.kyb.mpg.de/bs/people/fabee/universvm.html.