nips nips2013 nips2013-230 nips2013-230-reference knowledge-graph by maker-knowledge-mining

230 nips-2013-Online Learning with Costly Features and Labels

Source: pdf

Author: Navid Zolghadr, Gabor Bartok, Russell Greiner, András György, Csaba Szepesvari

Abstract: This paper introduces the online probing problem: In each round, the learner is able to purchase the values of a subset of feature values. After the learner uses this information to come up with a prediction for the given round, he then has the option of paying to see the loss function that he is evaluated against. Either way, the learner pays for both the errors of his predictions and also whatever he chooses to observe, including the cost of observing the loss function for the given round and the cost of the observed features. We consider two variations of this problem, depending on whether the learner can observe the label for free or not. We provide algorithms and upper and lower bounds on the regret for both variants. We show that a positive cost for observing the label signiﬁcantly increases the regret of the problem. 1

reference text

Agarwal, A. and Duchi, J. C. (2011). Distributed delayed stochastic optimization. In Shawe-Taylor, J., Zemel, R. S., Bartlett, P. L., Pereira, F. C. N., and Weinberger, K. Q., editors, NIPS, pages 873–881. Auer, P., Cesa-Bianchi, N., Freund, Y., and Schapire, R. E. (2002). The nonstochastic multiarmed bandit problem. SIAM J. Comput., 32(1):48–77. Bart´ k, G. (2012). The role of information in online learning. PhD thesis, Department of Computing o Science, University of Alberta. Cesa-Bianchi, N. and Lugosi, G. (2006). Prediction, learning, and games. Cambridge Univ Pr. Cesa-Bianchi, N., Lugosi, G., and Stoltz, G. (2005). Minimizing regret with label efﬁcient prediction. IEEE Transactions on Information Theory, 51(6):2152–2162. Cesa-Bianchi, N., Lugosi, G., and Stoltz, G. (2006). Regret minimization under partial monitoring. Math. Oper. Res., 31(3):562–580. Cesa-Bianchi, N., Shalev-Shwartz, S., and Shamir, O. (2010). Efﬁcient learning with partially observed attributes. CoRR, abs/1004.4421. Dekel, O., Shamir, O., and Xiao, L. (2010). Learning to classify with missing and corrupted features. Machine Learning, 81(2):149–178. Joulani, P., Gy¨ rgy, A., and Szepesv´ ri, C. (2013). Online learning under delayed feedback. In 30th o a International Conference on Machine Learning, Atlanta, GA, USA. Kapoor, A. and Greiner, R. (2005). Learning and classifying under hard budgets. In European Conference on Machine Learning (ECML), pages 166–173. Lizotte, D., Madani, O., and Greiner, R. (2003). Budgeted learning of naive-Bayes classiﬁers. In Conference on Uncertainty in Artiﬁcial Intelligence (UAI). Mannor, S. and Shamir, O. (2011). From bandits to experts: On the value of side-observations. CoRR, abs/1106.2436. Mesterharm, C. (2005). On-line learning with delayed label feedback. In Proceedings of the 16th international conference on Algorithmic Learning Theory, ALT’05, pages 399–413, Berlin, Heidelberg. Springer-Verlag. Rostamizadeh, A., Agarwal, A., and Bartlett, P. L. (2011). Learning with missing features. In UAI, pages 635–642. Settles, B. (2009). Active learning literature survey. Technical report. Weinberger, M. J. and Ordentlich, E. (2006). On delayed prediction of individual sequences. IEEE Trans. Inf. Theor., 48(7):1959–1976. 9