
41 nips-2003-Boosting versus Covering


Source: pdf

Author: Kohei Hatano, Manfred K. Warmuth

Abstract: We investigate improvements of AdaBoost that can exploit the fact that the weak hypotheses are one-sided, i.e., either all of a hypothesis's positive predictions or all of its negative predictions are correct. In particular, for any set of m labeled examples consistent with a disjunction of k literals (which are one-sided in this case), AdaBoost constructs a consistent hypothesis using O(k² log m) iterations. On the other hand, a greedy set covering algorithm finds a consistent hypothesis of size O(k log m). Our primary question is whether there is a simple boosting algorithm that performs as well as greedy set covering. We first show that InfoBoost, a modification of AdaBoost proposed by Aslam for a different purpose, does perform as well as the greedy set covering algorithm. We then show that AdaBoost requires Ω(k² log m) iterations for learning k-literal disjunctions; we establish this both with an adversary construction and in simple experiments on artificial data. Further, we give a variant called SemiBoost that can handle the degenerate case in which the given examples all have the same label. We conclude by showing that SemiBoost can also be used to produce small conjunctions.
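To make the covering baseline concrete, here is a minimal Python sketch of greedy set covering for learning a k-literal disjunction over Boolean feature vectors. The literal encoding, function name, and small demo are our own illustrative choices, not the paper's; only literals that never fire on a negative example are admitted, so every chosen literal is one-sided in the sense above.

    def greedy_cover_disjunction(X, y):
        """Greedily build a disjunction of literals consistent with (X, y).

        X: list of 0/1 feature vectors; y: list of 0/1 labels.
        A literal (i, b) fires on x when x[i] == b. Only literals that fire
        on no negative example are candidates, so every chosen literal is
        one-sided: all of its positive predictions are correct.
        """
        n = len(X[0])
        negatives = [x for x, label in zip(X, y) if label == 0]
        candidates = [(i, b) for i in range(n) for b in (0, 1)
                      if all(x[i] != b for x in negatives)]
        uncovered = {tuple(x) for x, label in zip(X, y) if label == 1}
        chosen = []
        while uncovered:
            # Greedy step: the literal covering the most uncovered positives.
            i, b = max(candidates,
                       key=lambda lit: sum(x[lit[0]] == lit[1] for x in uncovered))
            covered = {x for x in uncovered if x[i] == b}
            if not covered:
                raise ValueError("no disjunction of single literals is consistent")
            chosen.append((i, b))
            uncovered -= covered
        return chosen

    if __name__ == "__main__":
        from itertools import product
        # Demo target: x0 OR (NOT x2) over {0,1}^3.
        X = [list(bits) for bits in product((0, 1), repeat=3)]
        y = [int(x[0] == 1 or x[2] == 0) for x in X]
        print(greedy_cover_disjunction(X, y))  # [(0, 1), (2, 0)]

Since the target is a disjunction of k literals, one of those k literals must cover at least a 1/k fraction of the remaining positives at each step, so the greedy choice does too; this gives the O(k log m) size bound quoted above.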

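The mechanism behind InfoBoost can also be sketched in one round. Below is a hedged Python sketch, assuming {−1, +1}-valued weak hypotheses and Aslam-style confidence-rated updates; the function name, list-based weights, and clamping constant are illustrative assumptions rather than the paper's exact presentation.

    import math

    def infoboost_round(D, h_vals, y):
        """One InfoBoost round: a separate coefficient per output of h.

        D: example weights summing to 1; h_vals[i], y[i] in {-1, +1}.
        AdaBoost uses a single alpha = 0.5*ln((1+r)/(1-r)) with the global
        edge r; InfoBoost conditions the edge on the predicted value b, which
        lets it exploit one-sided hypotheses (a branch that is never wrong
        gets a very large coefficient; clamped here for numerical safety).
        """
        alpha = {}
        for b in (-1, 1):
            w = sum(d for d, hv in zip(D, h_vals) if hv == b)  # mass on branch b
            if w == 0:
                alpha[b] = 0.0
                continue
            r = sum(d * hv * yi
                    for d, hv, yi in zip(D, h_vals, y) if hv == b) / w
            r = max(min(r, 1 - 1e-9), -(1 - 1e-9))  # clamp the conditional edge
            alpha[b] = 0.5 * math.log((1 + r) / (1 - r))
        # Reweight: an example's weight shrinks when h is right on its branch.
        D_new = [d * math.exp(-alpha[hv] * hv * yi)
                 for d, hv, yi in zip(D, h_vals, y)]
        Z = sum(D_new)
        return alpha, [d / Z for d in D_new]

The combined hypothesis then takes the usual additive form sign(Σ_t α_t[h_t(x)] h_t(x)). For a one-sided weak hypothesis the conditional edge on its reliable branch is 1, so that branch receives an essentially unbounded coefficient in a single round, whereas AdaBoost's one global coefficient is averaged down by the uninformative branch; this asymmetry is what the O(k log m) versus Ω(k² log m) gap in the abstract reflects.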

reference text

[Asl00] J. A. Aslam. Improving algorithms for boosting. In Proc. 13th Annu. Conference on Comput. Learning Theory, pages 200–207, 2000.
[Fre95] Y. Freund. Boosting a weak learning algorithm by majority. Inform. Comput., 121(2):256–285, September 1995. Also appeared in COLT90.
[FS97] Y. Freund and R. E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci., 55(1):119–139, 1997.
[KW99] Jyrki Kivinen and Manfred K. Warmuth. Boosting as entropy projection. In Proc. 12th Annu. Conf. on Comput. Learning Theory, pages 134–144. ACM Press, New York, NY, 1999.
[Laf99] J. Lafferty. Additive models, boosting, and inference for generalized divergences. In Proc. 12th Annu. Conf. on Comput. Learning Theory, pages 125–133. ACM, 1999.
[MG92] M. Goldmann, J. Håstad, and A. A. Razborov. Majority gates vs. general weighted threshold gates. Computational Complexity, 1(4):277–300, 1992.
[Nat91] B. K. Natarajan. Machine Learning: A Theoretical Approach. Morgan Kaufmann, San Mateo, CA, 1991.
[Riv87] R. L. Rivest. Learning decision lists. Machine Learning, 2:229–246, 1987.
[RW02] G. Rätsch and M. K. Warmuth. Maximizing the margin with boosting. In Proceedings of the 15th Annual Conference on Computational Learning Theory, pages 334–350. Springer, July 2002.
[SS99] Robert E. Schapire and Yoram Singer. Improved boosting algorithms using confidence-rated predictions. Machine Learning, 37(3):297–336, 1999.