
20 nips-2001-A Sequence Kernel and its Application to Speaker Recognition

Source: pdf

Author: William M. Campbell

Abstract: A novel approach for comparing sequences of observations using an explicit-expansion kernel is demonstrated. The kernel is derived using the assumption of the independence of the sequence of observations and a mean-squared error training criterion. The use of an explicit-expansion kernel reduces classifier model size and computation dramatically, resulting in model sizes and computation one hundred times smaller in our application. The explicit expansion also preserves the computational advantages of an earlier architecture based on mean-squared error training. Training using standard support vector machine methodology gives accuracy that significantly exceeds the performance of state-of-the-art mean-squared error training for a speaker recognition task.
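The abstract does not state the kernel's formula, so the following is only a minimal sketch of what an explicit-expansion sequence kernel could look like, not the author's implementation. It assumes a polynomial expansion of each observation, averages that expansion over the sequence (which is where the independence assumption would enter), and weights the inner product by the inverse of a correlation matrix estimated from background data. The names `poly_expand`, `mean_expansion`, `sequence_kernel`, and `R_inv`, the expansion degree, and the random data are all hypothetical choices made for illustration.

```python
import numpy as np
from itertools import combinations_with_replacement


def poly_expand(x, degree=2):
    """Map one feature vector to all monomials up to `degree`, constant term first."""
    terms = [1.0]
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(len(x)), d):
            terms.append(float(np.prod([x[i] for i in idx])))
    return np.array(terms)


def mean_expansion(seq, degree=2):
    """Average the explicit expansion over every observation in a sequence."""
    return np.mean([poly_expand(x, degree) for x in seq], axis=0)


def sequence_kernel(seq_a, seq_b, R_inv, degree=2):
    """Weighted inner product of two averaged expansions (assumed kernel form)."""
    return float(mean_expansion(seq_a, degree) @ R_inv @ mean_expansion(seq_b, degree))


# Synthetic stand-ins for feature sequences; real use would take speech features.
rng = np.random.default_rng(0)
seq_a = rng.normal(size=(50, 3))
seq_b = rng.normal(size=(40, 3))
background = rng.normal(size=(500, 3))

# Correlation matrix of expanded background observations (an assumption here).
B = np.array([poly_expand(x) for x in background])
R = B.T @ B / len(B)
R_inv = np.linalg.pinv(R)

k_ab = sequence_kernel(seq_a, seq_b, R_inv)
k_ba = sequence_kernel(seq_b, seq_a, R_inv)
k_aa = sequence_kernel(seq_a, seq_a, R_inv)
```

Under these assumptions each sequence, whatever its length, collapses to a single fixed-size vector, so a kernel evaluation costs one small matrix-vector product. That is one plausible reading of the model-size and computation savings the abstract reports.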

reference text

[1] Vincent Wan and William M. Campbell, “Support vector machines for verification and identification,” in Neural Networks for Signal Processing X, Proceedings of the 2000 IEEE Signal Processing Workshop, 2000, pp. 775–784.

[2] Aravind Ganapathiraju and Joseph Picone, “Hybrid SVM/HMM architectures for speech recognition,” in Speech Transcription Workshop, 2000.

[3] John C. Platt, “Probabilities for SV machines,” in Advances in Large Margin Classifiers, Alexander J. Smola, Peter L. Bartlett, Bernhard Schölkopf, and Dale Schuurmans, Eds., pp. 61–74. The MIT Press, 2000.

[4] Tommi S. Jaakkola and David Haussler, “Exploiting generative models in discriminative classifiers,” in Advances in Neural Information Processing 11, M. S. Kearns, S. A. Solla, and D. A. Cohn, Eds. 1998, pp. 487–493, The MIT Press.

[5] Nathan Smith, Mark Gales, and Mahesan Niranjan, “Data-dependent kernels in SVM classification of speech patterns,” Tech. Rep. CUED/F-INFENG/TR.387, Cambridge University Engineering Department, 2001.

[6] Shai Fine, Jiří Navrátil, and Ramesh A. Gopinath, “A hybrid GMM/SVM approach to speaker recognition,” in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, 2001.

[7] William M. Campbell and Khaled T. Assaleh, “Polynomial classifier techniques for speaker verification,” in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, 1999, pp. 321–324.

[8] Kevin R. Farrell, Richard J. Mammone, and Khaled T. Assaleh, “Speaker recognition using neural networks and conventional classifiers,” IEEE Trans. on Speech and Audio Processing, vol. 2, no. 1, pp. 194–205, Jan. 1994.

[9] Douglas A. Reynolds, “Automatic speaker recognition using Gaussian mixture speaker models,” The Lincoln Laboratory Journal, vol. 8, no. 2, pp. 173–192, 1995.

[10] Michael J. Carey, Eluned S. Parris, and John S. Bridle, “A speaker verification system using alpha-nets,” in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, 1991, pp. 397–400.

[11] Jürgen Schürmann, Pattern Classification, John Wiley and Sons, Inc., 1996.

[12] Lawrence Rabiner and Biing-Hwang Juang, Fundamentals of Speech Recognition, Prentice-Hall, 1993.

[13] William M. Campbell and C. C. Broun, “A computationally scalable speaker recognition system,” in Proceedings of EUSIPCO, 2000, pp. 457–460.

[14] Joseph P. Campbell, Jr., “Testing with the YOHO CD-ROM voice verification corpus,” in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, 1995, pp. 341–344.

[15] Ronan Collobert and Samy Bengio, “Support vector machines for large-scale regression problems,” Tech. Rep. IDIAP-RR 00-17, IDIAP, 2000.