
20 nips-2001-A Sequence Kernel and its Application to Speaker Recognition


Source: pdf

Author: William M. Campbell

Abstract: A novel approach for comparing sequences of observations using an explicit-expansion kernel is demonstrated. The kernel is derived using the assumption of the independence of the sequence of observations and a mean-squared error training criterion. The use of an explicit-expansion kernel reduces classifier model size and computation dramatically, resulting in model sizes and computation one hundred times smaller in our application. The explicit expansion also preserves the computational advantages of an earlier architecture based on mean-squared error training. Training using standard support vector machine methodology gives accuracy that significantly exceeds the performance of state-of-the-art mean-squared error training for a speaker recognition task.
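To make the idea of an explicit-expansion sequence kernel concrete, the following is a minimal sketch and not the paper's implementation. The abstract does not give the kernel's form, so everything here is an assumption made for illustration: the polynomial expansion of degree 2, the averaging of expansions over a sequence, the normalization by the inverse correlation matrix of a background set, and all function names (`poly_expand`, `sequence_map`, `sequence_kernel`).

```python
import numpy as np
from itertools import combinations_with_replacement


def poly_expand(x, degree=2):
    """Map a feature vector to all monomials up to `degree`, including the constant 1."""
    terms = [1.0]
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(len(x)), d):
            terms.append(np.prod([x[i] for i in idx]))
    return np.array(terms)


def sequence_map(X, degree=2):
    """Collapse a sequence (rows of X are frames) into one vector by averaging expansions."""
    return np.mean([poly_expand(x, degree) for x in X], axis=0)


def sequence_kernel(X, Y, R_inv, degree=2):
    """Assumed kernel form: inner product of averaged expansions, weighted by R_inv."""
    return sequence_map(X, degree) @ R_inv @ sequence_map(Y, degree)


# Synthetic data standing in for acoustic feature sequences (hypothetical sizes).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))           # sequence of 50 frames, 3 features each
Y = rng.normal(size=(40, 3))           # sequence of 40 frames
background = rng.normal(size=(500, 3))

# Correlation matrix of the expanded background frames, then its pseudo-inverse.
B = np.array([poly_expand(x) for x in background])
R = B.T @ B / len(B)
R_inv = np.linalg.pinv(R)

k_xy = sequence_kernel(X, Y, R_inv)
k_yx = sequence_kernel(Y, X, R_inv)
k_xx = sequence_kernel(X, X, R_inv)
```

Under these assumptions each sequence, whatever its length, is reduced to a single fixed-length vector, so comparing two sequences costs one weighted inner product. That is one plausible reading of why an explicit expansion could keep model size and computation small.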


reference text

[1] Vincent Wan and William M. Campbell, “Support vector machines for verification and identification,” in Neural Networks for Signal Processing X, Proceedings of the 2000 IEEE Signal Processing Workshop, 2000, pp. 775–784.

[2] Aravind Ganapathiraju and Joseph Picone, “Hybrid SVM/HMM architectures for speech recognition,” in Speech Transcription Workshop, 2000.

[3] John C. Platt, “Probabilities for SV machines,” in Advances in Large Margin Classifiers, Alexander J. Smola, Peter L. Bartlett, Bernhard Schölkopf, and Dale Schuurmans, Eds., pp. 61–74. The MIT Press, 2000.

[4] Tommi S. Jaakkola and David Haussler, “Exploiting generative models in discriminative classifiers,” in Advances in Neural Information Processing Systems 11, M. S. Kearns, S. A. Solla, and D. A. Cohn, Eds. 1998, pp. 487–493, The MIT Press.

[5] Nathan Smith, Mark Gales, and Mahesan Niranjan, “Data-dependent kernels in SVM classification of speech patterns,” Tech. Rep. CUED/F-INFENG/TR.387, Cambridge University Engineering Department, 2001.

[6] Shai Fine, Jiří Navrátil, and Ramesh A. Gopinath, “A hybrid GMM/SVM approach to speaker recognition,” in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, 2001.

[7] William M. Campbell and Khaled T. Assaleh, “Polynomial classifier techniques for speaker verification,” in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, 1999, pp. 321–324.

[8] Kevin R. Farrell, Richard J. Mammone, and Khaled T. Assaleh, “Speaker recognition using neural networks and conventional classifiers,” IEEE Trans. on Speech and Audio Processing, vol. 2, no. 1, pp. 194–205, Jan. 1994.

[9] Douglas A. Reynolds, “Automatic speaker recognition using Gaussian mixture speaker models,” The Lincoln Laboratory Journal, vol. 8, no. 2, pp. 173–192, 1995.

[10] Michael J. Carey, Eluned S. Parris, and John S. Bridle, “A speaker verification system using alpha-nets,” in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, 1991, pp. 397–400.

[11] Jürgen Schürmann, Pattern Classification, John Wiley and Sons, Inc., 1996.

[12] Lawrence Rabiner and Biing-Hwang Juang, Fundamentals of Speech Recognition, Prentice-Hall, 1993.

[13] William M. Campbell and C. C. Broun, “A computationally scalable speaker recognition system,” in Proceedings of EUSIPCO, 2000, pp. 457–460.

[14] Joseph P. Campbell, Jr., “Testing with the YOHO CD-ROM voice verification corpus,” in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, 1995, pp. 341–344.

[15] Ronan Collobert and Samy Bengio, “Support vector machines for large-scale regression problems,” Tech. Rep. IDIAP-RR 00-17, IDIAP, 2000.