nips nips2004 nips2004-19 nips2004-19-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Taku Kudo, Eisaku Maeda, Yuji Matsumoto
Abstract: This paper presents an application of Boosting for classifying labeled graphs, general structures for modeling many kinds of real-world data, such as chemical compounds, natural language texts, and bio sequences. The proposal consists of i) decision stumps that use subgraphs as features, and ii) a Boosting algorithm in which subgraph-based decision stumps are used as weak learners. We also discuss the relation between our algorithm and SVMs with convolution kernels. Two experiments using natural language data and chemical compounds show that our method achieves comparable or even better performance than SVMs with convolution kernels while also improving testing efficiency.
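The two ingredients named in the abstract, subgraph-based decision stumps and a Boosting loop that uses them as weak learners, can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: each graph is assumed to be already reduced to a set of hypothetical subgraph identifiers (no subgraph mining is performed), the stump is assumed to vote `sign` when its subgraph is present and `-sign` otherwise, and plain AdaBoost is used as the boosting procedure. The names `stump_predict`, `train_adaboost`, and `classify` are illustrative only.

```python
import math


def stump_predict(feature, sign, graph_features):
    """Vote `sign` if the subgraph feature is present, otherwise `-sign`."""
    return sign if feature in graph_features else -sign


def train_adaboost(examples, labels, rounds):
    """Train on sets of (hypothetical) subgraph ids with labels in {+1, -1}."""
    n = len(examples)
    weights = [1.0 / n] * n
    # Assumption: candidate subgraphs are enumerated up front, not mined.
    candidates = sorted(set().union(*examples))
    model = []  # list of (alpha, feature, sign)
    for _ in range(rounds):
        # Pick the stump whose votes agree most with the weighted labels.
        best = None
        for feature in candidates:
            for sign in (+1, -1):
                gain = sum(w * y * stump_predict(feature, sign, x)
                           for w, x, y in zip(weights, examples, labels))
                if best is None or gain > best[0]:
                    best = (gain, feature, sign)
        if best is None or best[0] <= 0:
            break  # no candidate stump beats chance
        gain, feature, sign = best
        # gain = 1 - 2 * weighted_error; clip so alpha stays finite.
        gain = min(gain, 1.0 - 1e-10)
        alpha = 0.5 * math.log((1 + gain) / (1 - gain))
        model.append((alpha, feature, sign))
        # Standard AdaBoost update: misclassified examples gain weight.
        weights = [w * math.exp(-alpha * y * stump_predict(feature, sign, x))
                   for w, x, y in zip(weights, examples, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model


def classify(model, graph_features):
    """Weighted vote of all selected stumps."""
    score = sum(alpha * stump_predict(f, s, graph_features)
                for alpha, f, s in model)
    return 1 if score >= 0 else -1
```

In this sketch a trained model is just a short list of weighted subgraph tests, so classifying a new graph only requires checking those few subgraphs. That is one plausible reading of the testing-efficiency claim in the abstract, not a statement of the authors' implementation.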
[1] Leo Breiman. Prediction games and arcing algorithms. Neural Computation, 11(7):1493–1518, 1999.
[2] Michael Collins and Nigel Duffy. Convolution kernels for natural language. In NIPS 14, Vol.1, pages 625–632, 2001.
[3] Yoav Freund and Robert E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55(1):119–139, 1996.
[4] David Haussler. Convolution kernels on discrete structures. Technical report, UC Santa Cruz (UCSC-CRL-99-10), 1999.
[5] Hisashi Kashima and Teruo Koyanagi. SVM kernels for semi-structured data. In Proc. of ICML, pages 291–298, 2002.
[6] Hisashi Kashima, Koji Tsuda, and Akihiro Inokuchi. Marginalized kernels between labeled graphs. In Proc. of ICML, pages 321–328, 2003.
[7] Huma Lodhi, Craig Saunders, John Shawe-Taylor, Nello Cristianini, and Chris Watkins. Text classification using string kernels. Journal of Machine Learning Research, 2, 2002.
[8] Gunnar Rätsch, Takashi Onoda, and Klaus-Robert Müller. Soft margins for AdaBoost. Machine Learning, 42(3):287–320, 2001.
[9] Robert E. Schapire, Yoav Freund, Peter Bartlett, and Wee Sun Lee. Boosting the margin: a new explanation for the effectiveness of voting methods. In Proc. of ICML, pages 322–330, 1997.
[10] Robert E. Schapire and Yoram Singer. BoosTexter: A boosting-based system for text categorization. Machine Learning, 39(2/3):135–168, 2000.
[11] Vladimir N. Vapnik. Statistical Learning Theory. Wiley-Interscience, 1998.
[12] Xifeng Yan and Jiawei Han. gSpan: Graph-based substructure pattern mining. In Proc. of ICDM, pages 721–724, 2002.
Footnote 6: We tested the performance on Linux with dual Xeon 2.4 GHz processors.