On Flat versus Hierarchical Classification in Large-Scale Taxonomies (NIPS 2013, paper 216)

Source: pdf

Author: Rohit Babbar, Ioannis Partalas, Eric Gaussier, Massih-Reza Amini

Abstract: We study in this paper flat and hierarchical classification strategies in the context of large-scale taxonomies. To this end, we first propose a multiclass, hierarchical data dependent bound on the generalization error of classifiers deployed in large-scale taxonomies. This bound provides an explanation to several empirical results reported in the literature, related to the performance of flat and hierarchical classifiers. We then introduce another type of bound targeting the approximation error of a family of classifiers, and derive from it features used in a meta-classifier to decide which nodes to prune (or flatten) in a large-scale taxonomy. We finally illustrate the theoretical developments through several experiments conducted on two widely used taxonomies.

reference text

[1] P. L. Bartlett and S. Mendelson. Rademacher and Gaussian complexities: Risk bounds and structural results. Journal of Machine Learning Research, 3:463–482, 2002.

[2] S. Bengio, J. Weston, and D. Grangier. Label embedding trees for large multi-class tasks. In Advances in Neural Information Processing Systems 23, pages 163–171, 2010.

[3] L. Cai and T. Hofmann. Hierarchical document categorization with support vector machines. In Proceedings 13th ACM International Conference on Information and Knowledge Management (CIKM), pages 78–87. ACM, 2004.

[4] O. Dekel. Distribution-calibrated hierarchical classification. In Advances in Neural Information Processing Systems 22, pages 450–458, 2009.

[5] O. Dekel, J. Keshet, and Y. Singer. Large margin hierarchical classification. In Proceedings of the 21st International Conference on Machine Learning, pages 27–35, 2004.

[6] J. Deng, S. Satheesh, A. C. Berg, and F.-F. Li. Fast and balanced: Efficient label tree learning for large scale object recognition. In Advances in Neural Information Processing Systems 24, pages 567–575, 2011.

[7] S. Dumais and H. Chen. Hierarchical classiﬁcation of web content. In Proceedings of the 23rd annual international ACM SIGIR conference, pages 256–263, 2000.

[8] R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin. LIBLINEAR: A library for large linear classification. Journal of Machine Learning Research, 9:1871–1874, 2008.

[9] T. Gao and D. Koller. Discriminative learning of relaxed hierarchy for large-scale visual recognition. In IEEE International Conference on Computer Vision (ICCV), pages 2072–2079, 2011.

[10] S. Gopal, Y. Yang, and A. Niculescu-Mizil. Regularization framework for large scale hierarchical classification. In Large Scale Hierarchical Classification, ECML/PKDD Discovery Challenge Workshop, 2012.

[11] S. Gopal, Y. Yang, B. Bai, and A. Niculescu-Mizil. Bayesian models for large-scale hierarchical classification. In Advances in Neural Information Processing Systems 25, 2012.

[12] Y. Guermeur. Sample complexity of classifiers taking values in R^q, application to multi-class SVMs. Communications in Statistics - Theory and Methods, 39, 2010.

[13] T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning. Springer New York Inc., 2001.

[14] T.-Y. Liu, Y. Yang, H. Wan, H.-J. Zeng, Z. Chen, and W.-Y. Ma. Support vector machines classification with a very large-scale taxonomy. SIGKDD, 2005.

[15] H. Malik. Improving hierarchical SVMs by hierarchy flattening and lazy classification. In 1st Pascal Workshop on Large Scale Hierarchical Classification, 2009.

[16] F. Perronnin, Z. Akata, Z. Harchaoui, and C. Schmid. Towards good practice in large-scale learning for image classification. In Computer Vision and Pattern Recognition, pages 3482–3489, 2012.

[17] M. Schervish. Theory of Statistics. Springer Series in Statistics. Springer New York Inc., 1995.

[18] X. Wang and B.-L. Lu. Flatten hierarchies for large-scale hierarchical text categorization. In 5th International Conference on Digital Information Management, pages 139–144, 2010.

[19] K. Q. Weinberger and O. Chapelle. Large margin taxonomy embedding for document categorization. In Advances in Neural Information Processing Systems 21, pages 1737–1744, 2008.

[20] Y. Yang and X. Liu. A re-examination of text categorization methods. In Proceedings of the 22nd annual International ACM SIGIR conference, pages 42–49. ACM, 1999.

[21] J. Zhang, L. Tang, and H. Liu. Automatically adjusting content taxonomies for hierarchical classification. In Proceedings of the 4th Workshop on Text Mining, 2006.