nips nips2013 nips2013-216 nips2013-216-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Rohit Babbar, Ioannis Partalas, Eric Gaussier, Massih-Reza Amini
Abstract: In this paper, we study flat and hierarchical classification strategies in the context of large-scale taxonomies. To this end, we first propose a multiclass, hierarchical, data-dependent bound on the generalization error of classifiers deployed in large-scale taxonomies. This bound explains several empirical results reported in the literature on the performance of flat and hierarchical classifiers. We then introduce another type of bound targeting the approximation error of a family of classifiers, and derive from it features used in a meta-classifier to decide which nodes to prune (or flatten) in a large-scale taxonomy. We finally illustrate the theoretical developments through several experiments conducted on two widely used taxonomies.
[1] P. L. Bartlett and S. Mendelson. Rademacher and Gaussian complexities: Risk bounds and structural results. Journal of Machine Learning Research, 3:463–482, 2002.
[2] S. Bengio, J. Weston, and D. Grangier. Label embedding trees for large multi-class tasks. In Advances in Neural Information Processing Systems 23, pages 163–171, 2010.
[3] L. Cai and T. Hofmann. Hierarchical document categorization with support vector machines. In Proceedings 13th ACM International Conference on Information and Knowledge Management (CIKM), pages 78–87. ACM, 2004.
[4] O. Dekel. Distribution-calibrated hierarchical classification. In Advances in Neural Information Processing Systems 22, pages 450–458, 2009.
[5] O. Dekel, J. Keshet, and Y. Singer. Large margin hierarchical classification. In Proceedings of the 21st International Conference on Machine Learning, pages 27–35, 2004.
[6] J. Deng, S. Satheesh, A. C. Berg, and F.-F. Li. Fast and balanced: Efficient label tree learning for large scale object recognition. In Advances in Neural Information Processing Systems 24, pages 567–575, 2011.
[7] S. Dumais and H. Chen. Hierarchical classification of web content. In Proceedings of the 23rd annual international ACM SIGIR conference, pages 256–263, 2000.
[8] R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin. LIBLINEAR: A library for large linear classification. Journal of Machine Learning Research, 9:1871–1874, 2008.
[9] T. Gao and D. Koller. Discriminative learning of relaxed hierarchy for large-scale visual recognition. In IEEE International Conference on Computer Vision (ICCV), pages 2072–2079, 2011.
[10] S. Gopal, Y. Yang, and A. Niculescu-Mizil. Regularization framework for large scale hierarchical classification. In Large Scale Hierarchical Classification, ECML/PKDD Discovery Challenge Workshop, 2012.
[11] S. Gopal, Y. Yang, B. Bai, and A. Niculescu-Mizil. Bayesian models for large-scale hierarchical classification. In Advances in Neural Information Processing Systems 25, 2012.
[12] Y. Guermeur. Sample complexity of classifiers taking values in R^q, application to multi-class SVMs. Communications in Statistics - Theory and Methods, 39, 2010.
[13] T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning. Springer New York Inc., 2001.
[14] T.-Y. Liu, Y. Yang, H. Wan, H.-J. Zeng, Z. Chen, and W.-Y. Ma. Support vector machines classification with a very large-scale taxonomy. SIGKDD, 2005.
[15] H. Malik. Improving hierarchical SVMs by hierarchy flattening and lazy classification. In 1st Pascal Workshop on Large Scale Hierarchical Classification, 2009.
[16] F. Perronnin, Z. Akata, Z. Harchaoui, and C. Schmid. Towards good practice in large-scale learning for image classification. In Computer Vision and Pattern Recognition, pages 3482–3489, 2012.
[17] M. Schervish. Theory of Statistics. Springer Series in Statistics. Springer New York Inc., 1995.
[18] X. Wang and B.-L. Lu. Flatten hierarchies for large-scale hierarchical text categorization. In 5th International Conference on Digital Information Management, pages 139–144, 2010.
[19] K. Q. Weinberger and O. Chapelle. Large margin taxonomy embedding for document categorization. In Advances in Neural Information Processing Systems 21, pages 1737–1744, 2008.
[20] Y. Yang and X. Liu. A re-examination of text categorization methods. In Proceedings of the 22nd annual International ACM SIGIR conference, pages 42–49. ACM, 1999.
[21] J. Zhang, L. Tang, and H. Liu. Automatically adjusting content taxonomies for hierarchical classification. In Proceedings of the 4th Workshop on Text Mining, 2006.