nips nips2002 nips2002-191 nips2002-191-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Craig Saunders, Alexei Vinokourov, John S. Shawe-Taylor
Abstract: In this paper we show how the generation of documents can be thought of as a k-stage Markov process, which leads to a Fisher kernel from which the n-gram and string kernels can be re-constructed. The Fisher kernel view gives a more flexible insight into the string kernel and suggests how it can be parametrised in a way that reflects the statistics of the training corpus. Furthermore, the probabilistic modelling approach suggests extending the Markov process to consider sub-sequences of varying length, rather than the standard fixed-length approach used in the string kernel. We give a procedure for determining which sub-sequences are informative features and hence generate a Finite State Machine model, which can again be used to obtain a Fisher kernel. By adjusting the parametrisation we can also influence the weighting received by the features. In this way we are able to obtain a logarithmic weighting in a Fisher kernel. Finally, experiments are reported comparing the different kernels using the standard Bag of Words kernel as a baseline.
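To illustrate the n-gram kernel that the abstract says can be re-constructed from the Fisher kernel, here is a minimal Python sketch. It assumes character-level n-grams and takes an optional per-feature `weight` function. The names `ngram_counts` and `ngram_kernel` are hypothetical, and the weights are placeholders for illustration only, not the Fisher-derived parametrisation described in the paper.

```python
from collections import Counter


def ngram_counts(text, n):
    """Count every contiguous substring of length n in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def ngram_kernel(s, t, n, weight=None):
    """Inner product of the n-gram count vectors of s and t.

    `weight` is a hypothetical per-feature weighting (n-gram -> float).
    With the default weight of 1.0 this is the plain n-gram kernel.
    """
    cs, ct = ngram_counts(s, n), ngram_counts(t, n)
    if weight is None:
        weight = lambda u: 1.0
    # Only n-grams present in both strings contribute to the sum.
    return sum(weight(u) * cs[u] * ct[u] for u in cs.keys() & ct.keys())
```

For example, `ngram_kernel("abab", "bab", 2)` compares the bigram counts of the two strings. Passing a different `weight` function rescales individual features, which is the kind of freedom the abstract attributes to re-parametrising the model.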
[1] D. Haussler. Convolution kernels on discrete structures. Technical Report UCSC-CRL-99-10, University of California, Santa Cruz, July 1999.
[2] T. Jaakkola, M. Diekhans, and D. Haussler. Using the Fisher kernel method to detect remote protein homologies. 7th Intell. Sys. Mol. Biol., pages 149-158, 1999.
[3] T. Joachims. Making large-scale SVM learning practical. In B. Schölkopf, C. Burges, and A. Smola, editors, Advances in Kernel Methods - Support Vector Learning. MIT Press, 1999.
[4] Y. Li, H. Zaragoza, R. Herbrich, J. Shawe-Taylor, and J. Kandola. The perceptron algorithm with uneven margins. In Proceedings of the Nineteenth International Conference on Machine Learning (ICML '02), 2002.
[5] H. Lodhi, C. Saunders, J. Shawe-Taylor, N. Cristianini, and C. Watkins. Text classification using string kernels. Journal of Machine Learning Research, 2:419-444, 2002.
[6] H. Lodhi, J. Shawe-Taylor, N. Cristianini, and C. Watkins. Text classification using string kernels. In T. K. Leen, T. G. Dietterich, and V. Tresp, editors, Advances in Neural Information Processing Systems 13, pages 563-569. MIT Press, 2001.
[7] C. Watkins. Dynamic alignment kernels. Technical Report CSD-TR-98-11, Royal Holloway, University of London, January 1999.