
175 nips-2005-Sequence and Tree Kernels with Statistical Feature Mining


Source: pdf

Author: Jun Suzuki, Hideki Isozaki

Abstract: This paper proposes a new approach to feature selection based on a statistical feature mining technique for sequence and tree kernels. Since natural language data take discrete structures, convolution kernels, such as sequence and tree kernels, are advantageous for both the concept and accuracy of many natural language processing tasks. However, experiments have shown that the best results can only be achieved when these kernels are restricted to small sub-structures. This paper discusses this issue of convolution kernels and then proposes a statistical feature selection that enables us to use larger sub-structures effectively. For efficient execution, the proposed method can be embedded into the original kernel calculation process by using sub-structure mining algorithms. Experiments on real NLP tasks confirm the problem with the conventional method and compare its performance with that of the proposed method.
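As a concrete illustration of the kind of sequence kernel the abstract refers to (not the authors' own implementation), the gap-weighted subsequence kernel of Lodhi et al. [7] can be sketched in Python. It counts common subsequences of length `n` between two strings, discounting each occurrence by a decay factor `lam` raised to the total span it covers; the parameter names here are our own:

```python
def subsequence_kernel(s, t, n, lam=0.5):
    """Gap-weighted subsequence kernel K_n(s, t) of Lodhi et al. (2002).

    Each common subsequence of length n contributes lam**(span in s + span in t),
    so contiguous matches are weighted more heavily than gappy ones.
    """
    ls, lt = len(s), len(t)
    # kp[i][j][k] = K'_i(s[:j], t[:k]): weighted count of length-i subsequences
    # that end exactly at the prefix boundaries (the auxiliary DP table).
    kp = [[[1.0 if i == 0 else 0.0 for _ in range(lt + 1)]
           for _ in range(ls + 1)] for i in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, ls + 1):
            kpp = 0.0  # K''_i(s[:j], t[:k]), built incrementally over k
            for k in range(1, lt + 1):
                kpp *= lam  # extending t by one symbol adds one gap position
                if s[j - 1] == t[k - 1]:
                    kpp += lam * lam * kp[i - 1][j - 1][k - 1]
                kp[i][j][k] = lam * kp[i][j - 1][k] + kpp
    # Final kernel: sum contributions of every matching character pair that
    # completes a length-n common subsequence.
    value = 0.0
    for j in range(1, ls + 1):
        for k in range(1, lt + 1):
            if s[j - 1] == t[k - 1]:
                value += lam * lam * kp[n - 1][j - 1][k - 1]
    return value
```

For example, with `lam = 1.0` the kernel simply counts pairs of common subsequence occurrences: `subsequence_kernel("ab", "ab", 2, 1.0)` is 1 (only "ab" itself), while for `lam = 0.5` the same call yields 0.5**4, since the single length-2 match spans two characters in each string. The feature space here is exponential in `n`, which is exactly why the paper's statistical feature selection becomes necessary for larger sub-structures.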


Reference text

[1] N. Cancedda, E. Gaussier, C. Goutte, and J.-M. Renders. Word-Sequence Kernels. Journal of Machine Learning Research, 3:1059–1082, 2003.

[2] M. Collins and N. Duffy. Convolution kernels for natural language. In Proc. of Neural Information Processing Systems (NIPS’2001), 2001.

[3] D. Haussler. Convolution kernels on discrete structures. Technical Report UCSC-CRL-99-10, UC Santa Cruz, 1999.

[4] T. Joachims. Text Categorization with Support Vector Machines: Learning with Many Relevant Features. In Proc. of European Conference on Machine Learning (ECML ’98), pages 137–142, 1998.

[5] H. Kashima and T. Koyanagi. Kernels for Semi-Structured Data. In Proc. 19th International Conference on Machine Learning (ICML2002), pages 291–298, 2002.

[6] X. Li and D. Roth. Learning Question Classifiers. In Proc. of the 19th International Conference on Computational Linguistics (COLING 2002), pages 556–562, 2002.

[7] H. Lodhi, C. Saunders, J. Shawe-Taylor, N. Cristianini, and C. Watkins. Text Classification Using String Kernels. Journal of Machine Learning Research, 2:419–444, 2002.

[8] S. Morishita and J. Sese. Traversing Itemset Lattices with Statistical Metric Pruning. In Proc. of ACM SIGACT-SIGMOD-SIGART Symp. on Database Systems (PODS’00), pages 226–236, 2000.

[9] J. Pei, J. Han, B. Mortazavi-Asl, and H. Pinto. PrefixSpan: Mining Sequential Patterns Efficiently by Prefix-Projected Pattern Growth. In Proc. of the 17th International Conference on Data Engineering (ICDE 2001), pages 215–224, 2001.

[10] J. Suzuki, Y. Sasaki, and E. Maeda. Kernels for Structured Natural Language Data. In Proc. of the 17th Annual Conference on Neural Information Processing Systems (NIPS2003), 2003.

[11] C. Watkins. Dynamic alignment kernels. Technical Report CSD-TR-98-11, Royal Holloway, University of London, Department of Computer Science, 1999.

[12] M. J. Zaki. Efficiently Mining Frequent Trees in a Forest. In Proc. of the 8th International Conference on Knowledge Discovery and Data Mining (KDD’02), pages 71–80, 2002.