Title: On the Power of Membership Queries in Agnostic Learning
Author: Vitaly Feldman
Abstract: We study the properties of the agnostic learning framework of Haussler (1992) and Kearns, Schapire, and Sellie (1994). In particular, we address the question: is there any situation in which membership queries are useful in agnostic learning? Our results show that the answer is negative for distribution-independent agnostic learning and positive for agnostic learning with respect to a specific marginal distribution. Namely, we give a simple proof that any concept class learnable agnostically by a distribution-independent algorithm with access to membership queries is also learnable agnostically without membership queries. This resolves an open problem posed by Kearns et al. (1994). For agnostic learning with respect to the uniform distribution over {0,1}^n we show a concept class that is learnable with membership queries but computationally hard to learn from random examples alone (assuming that one-way functions exist). Keywords: agnostic learning, membership query, separation, PAC learning
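For context, a standard statement of the agnostic learning guarantee at issue (in the style of Kearns, Schapire, and Sellie, 1994; the formulation below is a sketch for orientation, not quoted from the paper): the learner draws examples (x, y) from an arbitrary joint distribution A over {0,1}^n × {0,1} and must, with high probability, output a hypothesis h whose error is within epsilon of the best function in the concept class C:

    \Pr_{(x,y) \sim A}[h(x) \neq y] \;\le\; \min_{c \in C} \Pr_{(x,y) \sim A}[c(x) \neq y] \;+\; \epsilon

In the membership-query variant the learner may additionally request labels at points of its own choosing; per the abstract, the paper shows this extra access adds no power in the distribution-independent setting but does help (assuming one-way functions exist) under the uniform distribution over {0,1}^n.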
References:
D. Angluin. Queries and concept learning. Machine Learning, 2:319–342, 1988.
A. Blum, M. Furst, M. Kearns, and R. J. Lipton. Cryptographic primitives based on hard learning problems. In Proceedings of the International Cryptology Conference on Advances in Cryptology (CRYPTO), pages 278–291, 1993.
A. Blum, A. Kalai, and H. Wasserman. Noise-tolerant learning, the parity problem, and the statistical query model. Journal of the ACM, 50(4):506–519, 2003.
N. Bshouty. Exact learning via the monotone theory. Information and Computation, 123(1):146–153, 1995.
N. Bshouty, J. Jackson, and C. Tamon. More efficient PAC learning of DNF with membership queries under the uniform distribution. In Proceedings of COLT, pages 286–295, 1999.
N. Bshouty, J. Jackson, and C. Tamon. More efficient PAC-learning of DNF with membership queries under the uniform distribution. Journal of Computer and System Sciences, 68(1):205–234, 2004.
R. Dudley. Central limit theorems for empirical measures. Annals of Probability, 6(6):899–929, 1978.
A. Elbaz, H. Lee, R. Servedio, and A. Wan. Separating models of learning from correlated and uncorrelated data. Journal of Machine Learning Research, 8:277–290, 2007.
V. Feldman. Optimal hardness results for maximizing agreements with monomials. In Proceedings of the Conference on Computational Complexity (CCC), pages 226–236, 2006.
V. Feldman. Attribute efficient and non-adaptive learning of parities and DNF expressions. Journal of Machine Learning Research, 8:1431–1460, 2007.
V. Feldman, P. Gopalan, S. Khot, and A. Ponnuswami. New results for learning noisy parities and halfspaces. In Proceedings of FOCS, pages 563–574, 2006.
V. Feldman and S. Shah. Separating models of learning with faulty teachers. Theoretical Computer Science, doi:10.1016/j.tcs.2009.01.017, 2009.
S. A. Goldman, S. Kwek, and S. D. Scott. Agnostic learning of geometric patterns. Journal of Computer and System Sciences, 62(1):123–151, 2001.
O. Goldreich and L. Levin. A hard-core predicate for all one-way functions. In Proceedings of STOC, pages 25–32, 1989.
O. Goldreich, S. Goldwasser, and S. Micali. How to construct random functions. Journal of the ACM, 33(4):792–807, 1986.
P. Gopalan, A. Kalai, and A. Klivans. Agnostically learning decision trees. In Proceedings of STOC, pages 527–536, 2008.
V. Guruswami and M. Sudan. List decoding algorithms for certain concatenated codes. In Proceedings of STOC, pages 181–190, 2000.
V. Guruswami and P. Raghavendra. Hardness of learning halfspaces with noise. In Proceedings of FOCS, pages 543–552, 2006.
S. Halevi and H. Krawczyk. Security under key-dependent inputs. In Proceedings of the 14th ACM Conference on Computer and Communications Security (CCS), pages 466–475, 2007.
J. Håstad. Some optimal inapproximability results. Journal of the ACM, 48(4):798–859, 2001.
J. Håstad, R. Impagliazzo, L. Levin, and M. Luby. A pseudorandom generator from any one-way function. SIAM Journal on Computing, 28(4):1364–1396, 1999.
D. Haussler. Decision theoretic generalizations of the PAC model for neural net and other learning applications. Information and Computation, 100(1):78–150, 1992.
A. Kalai, A. Klivans, Y. Mansour, and R. Servedio. Agnostically learning halfspaces. SIAM Journal on Computing, 37(6):1777–1805, 2008a.
A. Kalai, Y. Mansour, and E. Verbin. Agnostic boosting and parity learning. In Proceedings of STOC, pages 629–638, 2008b.
M. Kearns and L. Valiant. Cryptographic limitations on learning Boolean formulae and finite automata. Journal of the ACM, 41(1):67–95, 1994.
M. Kearns, R. Schapire, and L. Sellie. Toward efficient agnostic learning. Machine Learning, 17(2-3):115–141, 1994.
M. Kharitonov. Cryptographic lower bounds for learnability of Boolean functions on the uniform distribution. Journal of Computer and System Sciences, 50:600–610, 1995.
E. Kushilevitz and Y. Mansour. Learning decision trees using the Fourier spectrum. SIAM Journal on Computing, 22(6):1331–1348, 1993.
W. S. Lee, P. L. Bartlett, and R. C. Williamson. On efficient agnostic learning of linear combinations of basis functions. In Proceedings of COLT, pages 369–376, 1995.
L. Levin. Randomness and non-determinism. Journal of Symbolic Logic, 58(3):1102–1103, 1993.
N. Linial, Y. Mansour, and N. Nisan. Constant depth circuits, Fourier transform and learnability. Journal of the ACM, 40(3):607–620, 1993.
Y. Mansour. Learning Boolean functions via the Fourier transform. In V. P. Roychowdhury, K. Y. Siu, and A. Orlitsky, editors, Theoretical Advances in Neural Computation and Learning, pages 391–424. Kluwer, 1994.
R. O’Donnell and R. Servedio. Learning monotone decision trees in polynomial time. In Proceedings of the IEEE Conference on Computational Complexity, pages 213–225, 2006.
D. Pollard. Convergence of Stochastic Processes. Springer-Verlag, 1984.
L. G. Valiant. A theory of the learnable. Communications of the ACM, 27(11):1134–1142, 1984.
J. H. van Lint. Introduction to Coding Theory. Springer, Berlin, 1998.
V. Vapnik. Statistical Learning Theory. Wiley-Interscience, New York, 1998.
V. Vapnik and A. Chervonenkis. On the uniform convergence of relative frequencies of events to their probabilities. Theory of Probability and its Applications, 16(2):264–280, 1971.