
146 acl-2012-Modeling Topic Dependencies in Hierarchical Text Categorization

Source: pdf

Author: Alessandro Moschitti ; Qi Ju ; Richard Johansson

Abstract: In this paper, we encode topic dependencies in hierarchical multi-label Text Categorization (TC) by means of rerankers. We represent reranking hypotheses with several innovative kernels considering both the structure of the hierarchy and the probability of nodes. Additionally, to better investigate the role of category relationships, we consider two interesting cases: (i) traditional schemes in which node-fathers include all the documents of their child-categories; and (ii) more general schemes, in which children can include documents not belonging to their fathers. The extensive experimentation on Reuters Corpus Volume 1 shows that our rerankers inject effective structural semantic dependencies in multi-classifiers and significantly outperform the state-of-the-art.
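The two schemes in the abstract can be illustrated with a minimal Python sketch. It is not the authors' kernel-based reranker: it only enumerates candidate label sets (the hypotheses a reranker would later reorder) from per-category probabilities, and checks whether a label set satisfies the inclusion constraint of case (i). The toy hierarchy, the category names, the function names, and the independence assumption behind the joint score are all hypothetical choices made for illustration.

```python
from itertools import product
from math import prod

# Hypothetical toy hierarchy: child -> parent (None marks a top-level category).
PARENT = {"A": None, "A1": "A", "B": None, "B1": "B"}


def respects_inclusion(labels):
    """Case (i): every assigned child category also has its parent assigned."""
    return all(PARENT[c] is None or PARENT[c] in labels for c in labels)


def top_k_hypotheses(probs, k):
    """Enumerate all label sets and return the k most probable ones.

    The joint score multiplies per-category probabilities as if the
    categories were independent (an assumption of this sketch); ignoring
    dependencies in this way is what a reranker is meant to compensate for.
    """
    cats = sorted(probs)
    scored = []
    for bits in product([0, 1], repeat=len(cats)):
        labels = frozenset(c for c, b in zip(cats, bits) if b)
        score = prod(probs[c] if b else 1 - probs[c] for c, b in zip(cats, bits))
        scored.append((score, labels))
    # Sort by score only, so tied scores never compare the label sets.
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Hypothetical classifier outputs for a single document.
    probs = {"A": 0.4, "A1": 0.9, "B": 0.2, "B1": 0.1}
    for score, labels in top_k_hypotheses(probs, 3):
        print(sorted(labels), round(score, 4), respects_inclusion(labels))
```

With these made-up probabilities, the highest-scoring hypothesis assigns the child `A1` without its parent `A`. That label set is admissible under scheme (ii) but violates scheme (i), where the second hypothesis (`A` together with `A1`) would be the best consistent choice.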

reference text

Nir Ailon and Mehryar Mohri. 2010. Preference-based learning to rank. Machine Learning.
Maria-Florina Balcan, Nikhil Bansal, Alina Beygelzimer, Don Coppersmith, John Langford, and Gregory B. Sorkin. 2008. Robust reductions from ranking to classification. Machine Learning, 72(1-2):139–153.
Razvan Bunescu and Raymond Mooney. 2005. A shortest path dependency kernel for relation extraction. In Proceedings of HLT and EMNLP, pages 724–731, Vancouver, British Columbia, Canada, October.
Nicola Cancedda, Eric Gaussier, Cyril Goutte, and Jean Michel Renders. 2003. Word sequence kernels. Journal of Machine Learning Research, 3:1059–1082.
Michael Collins and Nigel Duffy. 2002. New ranking algorithms for parsing and tagging: Kernels over discrete structures, and the voted perceptron. In Proceedings of ACL’02, pages 263–270.
Chad Cumby and Dan Roth. 2003. On kernel methods for relational learning. In Proceedings of ICML 2003.
Hal Daumé III and Daniel Marcu. 2004. NP bracketing by maximum entropy tagging and SVM reranking. In Proceedings of EMNLP’04.
Susan T. Dumais and Hao Chen. 2000. Hierarchical classification of web content. In Nicholas J. Belkin, Peter Ingwersen, and Mun-Kew Leong, editors, Proceedings of SIGIR-00, 23rd ACM International Conference on Research and Development in Information Retrieval, pages 256–263, Athens, GR. ACM Press, New York, US.
T. Finley and T. Joachims. 2007. Parameter learning for loopy Markov random fields with structural support vector machines. In ICML Workshop on Constrained Optimization and Structured Output Spaces.
Alfio Gliozzo, Claudio Giuliano, and Carlo Strapparava. 2005. Domain kernels for word sense disambiguation. In Proceedings of ACL’05, pages 403–410.
Thorsten Joachims. 1999. Making large-scale SVM learning practical. Advances in Kernel Methods Support Vector Learning, 13.
Taku Kudo and Yuji Matsumoto. 2003. Fast methods for kernel-based text analysis. In Proceedings of ACL’03.
Taku Kudo, Jun Suzuki, and Hideki Isozaki. 2005. Boosting-based parse reranking with subtree features. In Proceedings of ACL’05.
T. Lavergne, O. Cappé, and F. Yvon. 2010. Practical very large scale CRFs. In Proc. of ACL, pages 504–513.
D. D. Lewis, Y. Yang, T. Rose, and F. Li. 2004. RCV1: A new benchmark collection for text categorization research. The Journal of Machine Learning Research, 5:361–397.
Andrew McCallum, Ronald Rosenfeld, Tom M. Mitchell, and Andrew Y. Ng. 1998. Improving text classification by shrinkage in a hierarchy of classes. In ICML, pages 359–367.
Alessandro Moschitti. 2006a. Efficient convolution kernels for dependency and constituent syntactic trees. In Proceedings of ECML’06.
Alessandro Moschitti. 2006b. Making tree kernels practical for natural language learning. In Proceedings of EACL’06.
S. Riezler and A. Vasserman. 2010. Incremental feature selection and l1 regularization for relaxed maximum-entropy modeling. In EMNLP.
Ryan Rifkin and Aldebaro Klautau. 2004. In defense of one-vs-all classification. J. Mach. Learn. Res., 5:101–141, December.
Juho Rousu, Craig Saunders, Sandor Szedmak, and John Shawe-Taylor. 2006. Kernel-based learning of hierarchical multilabel classification models. The Journal of Machine Learning Research, 7:1601–1626.
John Shawe-Taylor and Nello Cristianini. 2004. Kernel Methods for Pattern Analysis. Cambridge University Press.
Libin Shen, Anoop Sarkar, and Aravind K. Joshi. 2003. Using LTAG Based Features in Parse Reranking. In Empirical Methods for Natural Language Processing (EMNLP), pages 89–96, Sapporo, Japan.
Ivan Titov and James Henderson. 2006. Porting statistical parsers with data-defined kernels. In Proceedings of CoNLL-X.
Kristina Toutanova, Penka Markova, and Christopher Manning. 2004. The Leaf Path Projection View of Parse Trees: Exploring String Kernels for HPSG Parse Selection. In Proceedings of EMNLP 2004.
Ioannis Tsochantaridis, Thorsten Joachims, Thomas Hofmann, and Yasemin Altun. 2005. Large margin methods for structured and interdependent output variables. J. Machine Learning Research, 6:1453–1484, December.
Dmitry Zelenko, Chinatsu Aone, and Anthony Richardella. 2002. Kernel methods for relation extraction. In Proceedings of EMNLP-ACL, pages 181–201.
Min Zhang, Jie Zhang, and Jian Su. 2006. Exploring Syntactic Features for Relation Extraction using a Convolution tree kernel. In Proceedings of NAACL.