jmlr jmlr2010 jmlr2010-15 jmlr2010-15-reference knowledge-graph by maker-knowledge-mining

15 jmlr-2010-Approximate Tree Kernels

Source: pdf

Author: Konrad Rieck, Tammo Krueger, Ulf Brefeld, Klaus-Robert Müller

Abstract: Convolution kernels for trees provide simple means for learning with tree-structured data. The computation time of tree kernels is quadratic in the size of the trees, since all pairs of nodes need to be compared. Thus, large parse trees, obtained from HTML documents or structured network data, render convolution kernels inapplicable. In this article, we propose an effective approximation technique for parse tree kernels. The approximate tree kernels (ATKs) limit kernel computation to a sparse subset of relevant subtrees and discard redundant structures, such that training and testing of kernel-based learning methods are signiﬁcantly accelerated. We devise linear programming approaches for identifying such subsets for supervised and unsupervised learning tasks, respectively. Empirically, the approximate tree kernels attain run-time improvements up to three orders of magnitude while preserving the predictive accuracy of regular tree kernels. For unsupervised tasks, the approximate tree kernels even lead to more accurate predictions by identifying relevant dimensions in feature space. Keywords: tree kernels, approximation, kernel methods, convolution kernels

reference text

C. Bockermann, M. Apel, and M. Meier. Learning SQL for database intrusion detection using context-sensistive modelling. In Detection of Intrusions and Malware & Vulnerability Assessment (DIMVA), 2009. to appear. N. Borisov, D.J. Brumley, H. Wang, J. Dunagan, P. Joshi, and C. Guo. Generic application-level protocol analyzer and its language. In Proc. of Network and Distributed System Security Symposium (NDSS), 2007. M. L. Braun, J. Buhmann, and K.-R. M¨ ller. On relevant dimensions in kernel feature spaces. u Journal of Machine Learning Research, 9:1875–1908, Aug 2008. C. Castillo, D. Donato, L. Becchetti, P. Boldi, S. Leonardi, M. Santini, and S. Vigna. A reference collection for web spam. SIGIR Forum, 40(2):11–24, 2006. URL http://portal.acm.org/ citation.cfm?id=1189703. C.-C. Chang and C.-J. Lin. LIBSVM: Introduction and benchmarks. Technical report, Department of Computer Science and Information Engineering, National Taiwan University, Taipei, 2000. E. Charniak. A maximum-entropy-inspired parser. Technical Report CS-99-12, Brown University, 1999. E. Cilia and A. Moschitti. Advanced tree-based kernels for protein classiﬁcation. In Artiﬁcial Intelligence and Human-Oriented Computing (AI*IA), 10th Congress, LNCS, pages 218–229, 2007. M. Collins and N. Duffy. Convolution kernel for natural language. In Advances in Neural Information Proccessing Systems (NIPS), volume 16, pages 625–632, 2002. N. Cristianini, J. Shawe-Taylor, A. Elisseeff, and J. S. Kandola. On kernel target alignment. In Advances in Neural Information Proccessing Systems (NIPS), volume 14, pages 367–737, 2001. I. Drost and T. Scheffer. Thwarting the nigritude ultramarine: Learning to identify link spam. In Proc. of the Eurpoean Conference on Machine Learning (ECML), 2005. P. D¨ ssel, C. Gehl, P. Laskov, and K. Rieck. Incorporation of application layer protocol syntax u into anomaly detection. In Proc. of International Conference on Information Systems Security (ICISS), pages 188–202, 2008. E. Eskin, A. Arnold, M. Prerau, L. Portnoy, and S. Stolfo. Applications of Data Mining in Computer Security, chapter A geometric framework for unsupervised anomaly detection: detecting intrusions in unlabeled data. Kluwer, 2002. R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, and T. Berners-Lee. Hypertext Transfer Protocol – HTTP/1.1. RFC 2616 (Draft Standard), June 1999. URL http://www.ietf. org/rfc/rfc2616.txt. Updated by RFC 2817. D. Haussler. Convolution kernels on discrete structures. Technical Report UCSC-CRL-99-10, UC Santa Cruz, July 1999. 578 A PPROXIMATE T REE K ERNELS J.E. Hopcroft and J.D. Motwani, R. Ullmann. Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, 2 edition, 2001. H. Kashima and T. Koyanagi. Kernels for semi-structured data. In International Conference on Machine Learning (ICML), pages 291–298, 2002. D. Knuth. The Art of Computer Programming, volume 3. Addison-Wesley, 1973. C. Kruegel and G. Vigna. Anomaly detection of web-based attacks. In Proc. of 10th ACM Conf. on Computer and Communications Security, pages 251–261, 2003. P. Laskov, K. Rieck, and K.-R. M¨ ller. Machine learning for intrusion detection. In Mining Massive u Data Sets for Security, pages 366–373. IOS press, 2008. X. Li and D. Roth. Learning question classiﬁers. In International Conference on Computational Linguistics (ICCL), pages 1–7, 2002. doi: http://dx.doi.org/10.3115/1072228.1072378. R. Lippmann, J.W. Haines, D.J. Fried, J. Korba, and K. Das. The 1999 DARPA off-line intrusion detection evaluation. Computer Networks, 34(4):579–595, 2000. C. Manning and H. Sch¨ tze. Foundations of Statistical Natural Language Processing. MIT Press, u 1999. S. Mika, B. Sch¨ lkopf, A.J. Smola, K.-R. M¨ ller, M. Scholz, and G. R¨ tsch. Kernel PCA and o u a de–noising in feature spaces. In M.S. Kearns, S.A. Solla, and D.A. Cohn, editors, Advances in Neural Information Processing Systems, volume 11, pages 536–542. MIT Press, 1999. A. Moschitti. Making tree kernels practical for natural language processing. In Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2006a. A. Moschitti. Efﬁcient convolution kernels for dependency and constituent syntactic trees. In European Conference on Machine Learning (ECML), 2006b. A. Moschitti and F.M. Zanzotto. Fast and effective kernels for relation learning from texts. In International Conference on Machine Learning (ICML), 2007. K.-R. M¨ ller, S. Mika, G. R¨ tsch, K. Tsuda, and B. Sch¨ lkopf. An introduction to kernel-based u a o learning algorithms. IEEE Neural Networks, 12(2):181–201, May 2001. R. Pang, V. Paxson, R. Sommer, and L.L. Peterson. binpac: a yacc for writing application protocol parsers. In Proc. of ACM Internet Measurement Conference, pages 289–300, 2006. V. Paxson and R. Pang. A high-level programming environment for packet trace anonymization and transformation. In Proc. of Applications, Technologies, Architectures, and Protocols for Computer Communications SIGCOMM, pages 339 – 351, 2003. J. Postel and J. Reynolds. File Transfer Protocol. RFC 959 (Standard), October 1985. URL http://www.ietf.org/rfc/rfc959.txt. Updated by RFCs 2228, 2640, 2773, 3659. K. Rieck and P. Laskov. Language models for detection of unknown attacks in network trafﬁc. Journal in Computer Virology, 2(4):243–256, 2007. 579 ¨ R IECK , K RUEGER , B REFELD AND M ULLER K. Rieck, U. Brefeld, and T. Krueger. Approximate kernels for trees. Technical Report FIRST 5/2008, Fraunhofer Institute FIRST, September 2008. B. Sch¨ lkopf and A.J. Smola. Learning with Kernels. MIT Press, Cambridge, MA, 2002. o B. Sch¨ lkopf, A.J. Smola, and K.-R. M¨ ller. Nonlinear component analysis as a kernel eigenvalue o u problem. Neural Computation, 10:1299–1319, 1998. B. Sch¨ lkopf, J. Platt, J. Shawe-Taylor, A.J. Smola, and R.C. Williamson. Estimating the support o of a high-dimensional distribution. TR 87, Microsoft Research, Redmond, WA, 1999. J. Shawe-Taylor and N. Cristianini. Kernel Methods for Pattern Analysis. Cambridge University Press, 2004. J. Suzuki and H. Isozaki. Sequence and tree kernels with statistical feature mining. In Advances in Neural Information Proccessing Systems (NIPS), volume 17, 2005. J. Suzuki, H. Isozaki, and E. Maeda. Convolution kernels with feature selection for natural language processing tasks. In Annual Meeting on Association for Computational Linguistics (ACL), 2004. D.M.J. Tax and R.P.W. Duin. Support vector domain description. Pattern Recognition Letters, 20 (11–13):1191–1199, 1999. V.N. Vapnik. The Nature of Statistical Learning Theory. Springer, New York, 1995. S.V.N. Vishwanathan and A.J. Smola. Fast kernels for string and tree matching. In Advances in Neural Information Proccessing Systems (NIPS), pages 569–576, 2003. E.M. Voorhees. Overview of the trec 2004 question answering track. In Proc. of the Thirteenth Text Retreival Conference (TREC), 2004. G. Wondracek, P.M. Comparetti, C. Kruegel, and E. Kirda. Automatic network protocol analysis. In Proc. of Network and Distributed System Security Symposium (NDSS), 2008. B. Wu and B. D. Davison. Identifying link farm spam pages. In WWW ’05: Special Interest Tracks, 14th International Conference on World Wide Web, 2005. D. Zhang and W. S. Lee. Question classiﬁcation using support vector machines. In Annual International ACM SIGIR Conference, pages 26–32, 2003. 580