nips2012-207: NIPS 2012 paper reference record
Source: pdf
Author: Wei Bi, James T. Kwok
Abstract: In hierarchical classification, the prediction paths may be required to always end at leaf nodes. This is called mandatory leaf node prediction (MLNP) and is particularly useful when the leaf nodes have much stronger semantic meaning than the internal nodes. However, while many MLNP methods exist for hierarchical multiclass classification, performing MLNP in hierarchical multilabel classification is much more difficult. In this paper, we propose a novel MLNP algorithm that (i) considers the global hierarchy structure; and (ii) can be used on both tree- and DAG-structured hierarchies. We show that one can efficiently maximize the joint posterior probability of all the node labels with a simple greedy algorithm. Moreover, this can be further extended to minimizing the expected symmetric loss. Experiments are performed on a number of real-world data sets with tree- and DAG-structured label hierarchies. The proposed method consistently outperforms other hierarchical and flat multilabel classification methods.
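To make the greedy idea in the abstract concrete, below is a minimal illustrative sketch in Python. It is not the authors' algorithm: it covers only the simpler multiclass special case, where MLNP reduces to picking the single root-to-leaf path with the highest joint posterior, and it assumes, purely for illustration, that the per-node posteriors factorize independently so path scores are sums of per-node log-probabilities. The toy hierarchy `tree`, the probabilities `p`, and the helper `best_leaf_path` are all hypothetical names.

```python
import math

# Hypothetical toy hierarchy: parent -> children (leaves have no entry).
tree = {"root": ["A", "B"], "A": ["A1", "A2"], "B": ["B1"]}
# Assumed per-node posteriors P(y_n = 1 | x), treated as independent
# here only to keep the illustration simple.
p = {"root": 1.0, "A": 0.7, "A1": 0.6, "A2": 0.3, "B": 0.4, "B1": 0.9}

def best_leaf_path(node):
    """Return (sum of log-posteriors, path) for the best root-to-leaf path."""
    score = math.log(p[node])
    children = tree.get(node, [])
    if not children:  # mandatory leaf node prediction: paths end only at leaves
        return score, [node]
    best = max((best_leaf_path(c) for c in children), key=lambda t: t[0])
    return score + best[0], [node] + best[1]

score, path = best_leaf_path("root")
print(path)  # ['root', 'A', 'A1']
```

The multilabel setting treated in the paper is harder: a set of leaves must be selected jointly under the hierarchy constraints, which is what the proposed greedy maximization of the joint posterior over all node labels addresses.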
[1] C. Vens, J. Struyf, L. Schietgat, S. Džeroski, and H. Blockeel. Decision trees for hierarchical multi-label classification. Machine Learning, 73:185–214, 2008.
[2] J.J. Burred and A. Lerch. A hierarchical approach to automatic musical genre classification. In Proceedings of the 6th International Conference on Digital Audio Effects, 2003.
[3] N. Cesa-Bianchi, C. Gentile, and L. Zaniboni. Incremental algorithms for hierarchical classification. Journal of Machine Learning Research, 7:31–54, 2006.
[4] C.N. Silla and A.A. Freitas. A survey of hierarchical classification across different application domains. Data Mining and Knowledge Discovery, 22(1-2):31–72, 2011.
[5] Z. Barutcuoglu, R.E. Schapire, and O.G. Troyanskaya. Hierarchical multi-label prediction of gene function. Bioinformatics, 22:830–836, 2006.
[6] K. Punera, S. Rajan, and J. Ghosh. Automatically learning document taxonomies for hierarchical classification. In Proceedings of the 14th International Conference on World Wide Web, pages 1010–1011, 2005.
[7] M.-L. Zhang and K. Zhang. Multi-label learning by exploiting label dependency. In Proceedings of the 16th International Conference on Knowledge Discovery and Data Mining, pages 999–1008, 2010.
[8] S. Bengio, J. Weston, and D. Grangier. Label embedding trees for large multi-class tasks. In Advances in Neural Information Processing Systems 23, pages 163–171, 2010.
[9] J. Deng, S. Satheesh, A.C. Berg, and L. Fei-Fei. Fast and balanced: Efficient label tree learning for large scale object recognition. In Advances in Neural Information Processing Systems 24, pages 567–575, 2011.
[10] J. Rousu, C. Saunders, S. Szedmak, and J. Shawe-Taylor. Kernel-based learning of hierarchical multilabel classification models. Journal of Machine Learning Research, 7:1601–1626, 2006.
[11] W. Bi and J.T. Kwok. Multi-label classification on tree- and DAG-structured hierarchies. In Proceedings of the 28th International Conference on Machine Learning, pages 17–24, 2011.
[12] N. Cesa-Bianchi, C. Gentile, and L. Zaniboni. Hierarchical classification: Combining Bayes with SVM. In Proceedings of the 23rd International Conference on Machine Learning, pages 177–184, 2006.
[13] L. Tang, S. Rajan, and V.K. Narayanan. Large scale multi-label classification via MetaLabeler. In Proceedings of the 18th International Conference on World Wide Web, pages 211–220, 2009.
[14] R. Cerri, A. C. P. L. F. de Carvalho, and A. A. Freitas. Adapting non-hierarchical multilabel classification methods for hierarchical multilabel classification. Intelligent Data Analysis, 15:861–887, 2011.
[15] G. Tsoumakas and I. Vlahavas. Random k-labelsets: An ensemble method for multilabel classification. In Proceedings of the 18th European Conference on Machine Learning, pages 406–417, 2007.
[16] N. Cesa-Bianchi, C. Gentile, A. Tironi, and L. Zaniboni. Incremental algorithms for hierarchical classification. In Advances in Neural Information Processing Systems 17, pages 233–240, 2005.
[17] J.H. Zaragoza, L.E. Sucar, and E.F. Morales. Bayesian chain classifiers for multidimensional classification. In Proceedings of the 22nd International Joint Conference on Artificial Intelligence, pages 2192–2197, 2011.
[18] R.G. Baraniuk, V. Cevher, M.F. Duarte, and C. Hegde. Model-based compressive sensing. IEEE Transactions on Information Theory, 56:1982–2001, 2010.
[19] S.E. Shimony. Finding MAPs for belief networks is NP-hard. Artificial Intelligence, 68:399–410, 1994.
[20] C. Varin, N. Reid, and D. Firth. An overview of composite likelihood methods. Statistica Sinica, 21:5–42, 2011.
[21] Y. Zhang and J. Schneider. A composite likelihood view for multi-label classification. In Proceedings of the 15th International Conference on Artificial Intelligence and Statistics, pages 1407–1415, 2012.
[22] J. Zhou, J. Chen, and J. Ye. MALSAR: Multi-tAsk Learning via StructurAl Regularization. Arizona State University, 2012.
[23] G. Tsoumakas, I. Katakis, and I. Vlahavas. Mining multi-label data. In Data Mining and Knowledge Discovery Handbook, pages 667–685. Springer, 2010.