
67 nips-2002-Discriminative Binaural Sound Localization



Author: Ehud Ben-Reuven, Yoram Singer

Abstract: Time difference of arrival (TDOA) is commonly used to estimate the azimuth of a source with a microphone array. The most common methods for estimating TDOA are based on finding extrema in generalized cross-correlation waveforms. In this paper we apply microphone array techniques to a manikin head. By considering the entire cross-correlation waveform, we achieve azimuth prediction accuracy that exceeds that of extrema-locating methods. We do so by quantizing the azimuthal angle and treating the prediction problem as a multiclass categorization task. We demonstrate the merits of our approach by evaluating the various approaches on Sony’s AIBO robot.
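
To make the two ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It shows the extrema-locating baseline (picking the peak of a generalized cross-correlation waveform) and the quantization of azimuth into class labels. The PHAT weighting, the function names `gcc_phat`, `estimate_tdoa`, and `quantize_azimuth`, the 15-degree bin width, and the 0 to 360 degree range are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

def gcc_phat(x, y, max_lag):
    """PHAT-weighted generalized cross-correlation of x and y.

    Returns (lags, cc) for integer lags in [-max_lag, max_lag].
    PHAT is one common weighting choice; it is assumed here.
    """
    n = len(x) + len(y)                  # zero-pad to avoid circular wrap-around
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = X * np.conj(Y)               # cross-power spectrum
    cross /= np.abs(cross) + 1e-12       # keep phase only (PHAT weighting)
    cc = np.fft.irfft(cross, n=n)
    # Reorder so the first entry is lag -max_lag and the last is +max_lag.
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    lags = np.arange(-max_lag, max_lag + 1)
    return lags, cc

def estimate_tdoa(x, y, max_lag):
    """Extrema-locating baseline: lag (in samples) of the waveform maximum.

    A positive result means x is delayed relative to y.
    """
    lags, cc = gcc_phat(x, y, max_lag)
    return int(lags[np.argmax(cc)])

def quantize_azimuth(azimuth_deg, bin_width_deg=15.0):
    """Map an azimuth in degrees to a class index (hypothetical bin width)."""
    return int(np.floor((azimuth_deg % 360.0) / bin_width_deg))

# Usage: a noise burst that reaches the second channel 5 samples later.
rng = np.random.default_rng(0)
s = rng.standard_normal(1000)
y = np.concatenate((s, np.zeros(24)))
x = np.concatenate((np.zeros(5), s, np.zeros(19)))
delay = estimate_tdoa(x, y, max_lag=20)
```

In the discriminative setting described in the abstract, the whole vector `cc` returned by `gcc_phat` (rather than only its peak location) would serve as the input to a multiclass classifier, with `quantize_azimuth` supplying the target labels.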


reference text

[1] C. H. Knapp and G. C. Carter. The generalized correlation method for estimation of time delay. IEEE Transactions on ASSP, 24(4):320–327, 1976.

[2] M. Omologo and P. Svaizer. Acoustic event localization using a crosspower-spectrum phase based technique. Proceedings of ICASSP 1994, Adelaide, Australia, 1994.

[3] T. Gustafsson and B. D. Rao. Source Localization in Reverberant Environments: Statistical Analysis. Submitted to IEEE Trans. on Speech and Audio Processing, 2000.

[4] N. Strobel and R. Rabenstein. Classification of Time Delay Estimates for Robust Speaker Localization. ICASSP, Phoenix, USA, March 1999.

[5] J. Benesty. Adaptive eigenvalue decomposition algorithm for passive acoustic source localization. J. Acoust. Soc. Am., 107(1), January 2000.

[6] K. Crammer and Y. Singer. Ultraconservative online algorithms for multiclass problems. In Proc. of the 14th Annual Conf. on Computational Learning Theory, 2001.

[7] R. O. Duda and P. E. Hart. Pattern Classification. Wiley, 1973.

[8] B. Porat. A Course in Digital Signal Processing. Wiley, 1997.

[9] F. Rosenblatt. The Perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65:386–407, 1958.

[10] B. Widrow and M. E. Hoff. Adaptive switching circuits. 1960 IRE WESCON Convention Record, pages 96–104, 1960.

[11] P. Aarabi and A. Mahdavi. The Relation Between Speech Segment Selectivity and Time-Delay Estimation Accuracy. In Proc. of IEEE Conf. on Acoustics Speech and Signal Processing, 2002.