114 nips-2008-Large Margin Taxonomy Embedding for Document Categorization

Author: Kilian Q. Weinberger, Olivier Chapelle

Abstract: Applications of multi-class classification, such as document categorization, often appear in cost-sensitive settings. Recent work has significantly improved the state of the art by moving beyond “flat” classification through incorporation of class hierarchies [4]. We present a novel algorithm that goes beyond hierarchical classification and estimates the latent semantic space that underlies the class hierarchy. In this space, each class is represented by a prototype and classification is done with the simple nearest neighbor rule. The optimization of the semantic space incorporates large margin constraints that ensure that for each instance the correct class prototype is closer than any other. We show that our optimization is convex and can be solved efficiently for large data sets. Experiments on the OHSUMED medical journal data base yield state-of-the-art results on topic categorization.
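The nearest-prototype rule and the margin condition described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the linear map `W`, the fixed unit margin, the function names, and the toy data are hypothetical, and the sketch does not reproduce the paper's optimization or its cost-sensitive formulation.

```python
from math import dist  # Euclidean distance (Python 3.8+)


def embed(W, x):
    """Map input x into the semantic space with a linear map W (list of rows)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def nearest_prototype(W, prototypes, x):
    """Return the index of the class prototype closest to the embedded x."""
    z = embed(W, x)
    return min(range(len(prototypes)), key=lambda c: dist(z, prototypes[c]))


def margin_violations(W, prototypes, x, y, margin=1.0):
    """List classes c != y whose prototype is NOT at least `margin` farther
    (in squared distance) from the embedded x than the correct prototype y."""
    z = embed(W, x)
    d_true = dist(z, prototypes[y]) ** 2
    return [
        c
        for c in range(len(prototypes))
        if c != y and dist(z, prototypes[c]) ** 2 < d_true + margin
    ]


# Hypothetical toy setup: 3 classes, 2-D inputs, 2-D semantic space.
W = [[1.0, 0.0], [0.0, 1.0]]                       # identity map (assumed)
prototypes = [[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]]  # one prototype per class
x = [3.5, 0.5]

predicted = nearest_prototype(W, prototypes, x)
violations = margin_violations(W, prototypes, x, y=1)
```

In this toy setup the instance lies closest to the prototype of class 1, and no other prototype comes within the assumed margin, so `violations` is empty. A learning procedure in this spirit would adjust `W` until such violations disappear on the training data.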

reference text

[1] D. Blei, A. Ng, M. Jordan, and J. Lafferty. Latent Dirichlet Allocation. Journal of Machine Learning Research, 3(4-5):993–1022, 2003.

[2] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004.

[3] A. Budanitsky and G. Hirst. Semantic distance in WordNet: An experimental, application-oriented evaluation of five measures. In Workshop on WordNet and Other Lexical Resources, in the North American Chapter of the Association for Computational Linguistics (NAACL), 2001.

[4] L. Cai and T. Hofmann. Hierarchical document categorization with support vector machines. In ACM 13th Conference on Information and Knowledge Management, 2004.

[5] T. Cox and M. Cox. Multidimensional Scaling. Chapman & Hall, London, 1994.

[6] K. Crammer and Y. Singer. On the algorithmic implementation of multiclass kernel-based vector machines. Journal of Machine Learning Research, 2:265–292, 2001.

[7] S. Deerwester, S. Dumais, G. Furnas, T. Landauer, and R. Harshman. Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41(6):391–407, 1990.

[8] S. Dumais and H. Chen. Hierarchical classification of Web content. In Proceedings of SIGIR-00, 23rd ACM International Conference on Research and Development in Information Retrieval, pages 256–263. ACM Press, New York, US, 2000.

[9] W. Hersh, C. Buckley, T. J. Leone, and D. Hickam. OHSUMED: an interactive retrieval evaluation and new large test collection for research. In SIGIR ’94: Proceedings of the 17th annual international ACM conference on Research and development in information retrieval, pages 192–201. Springer-Verlag New York, Inc., 1994.

[10] G. Karypis, E. Hong, and S. Han. Concept indexing: a fast dimensionality reduction algorithm with applications to document retrieval & categorization. Technical Report 00-016, 2000.

[11] T.-Y. Liu, Y. Yang, H. Wan, H.-J. Zeng, Z. Chen, and W.-Y. Ma. Support vector machines classification with a very large-scale taxonomy. SIGKDD Explorations Newsletter, 7(1):36–43, 2005.

[12] R. Rifkin and A. Klautau. In Defense of One-Vs-All Classification. The Journal of Machine Learning Research, 5:101–141, 2004.

[13] F. Sebastiani. Machine learning in automated text categorization. ACM Computing Surveys, 34(1):1–47, 2002.

[14] F. Sha and L. K. Saul. Large margin hidden Markov models for automatic speech recognition. In Advances in Neural Information Processing Systems 19, Cambridge, MA, 2007. MIT Press.

[15] A. Weigend, E. Wiener, and J. Pedersen. Exploiting Hierarchy in Text Categorization. Information Retrieval, 1(3):193–216, 1999.

[16] K. Q. Weinberger and L. K. Saul. Fast solvers and efficient implementations for distance metric learning. pages 1160–1167, 2008.