Author: Yun-gang Zhang, Chang-shui Zhang
Abstract: Separation of music signals is an interesting but difficult problem. It is helpful for many other music research tasks such as audio content analysis. In this paper, a new music signal separation method based on harmonic structure modeling is proposed. The main idea of harmonic structure modeling is that the harmonic structure of a music signal is stable, so a music signal can be represented by a harmonic structure model. Accordingly, a corresponding separation algorithm is proposed: learn a harmonic structure model for each music signal in the mixture, and then separate the signals by using these models to distinguish the harmonic structures of different signals. Experimental results show that the algorithm can separate signals and obtain not only a very high Signal-to-Noise Ratio (SNR) but also rather good subjective audio quality.
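To make the abstract's idea concrete, the following is a minimal, self-contained sketch (not the authors' implementation) of separating notes by their harmonic structure: each source is summarized by a stable vector of relative harmonic amplitudes learned from the mixture, and detected notes are then assigned to the closest learned model. The descriptor, the plain k-means learning step, the synthetic test data, and all function names and parameters are illustrative assumptions rather than the paper's algorithm.

```python
import numpy as np


def harmonic_structure(frame_spectrum, f0_bin, n_harmonics=8):
    """Relative log-amplitudes of the first n_harmonics partials of a note whose
    fundamental lies at FFT bin f0_bin (a simplified 'harmonic structure' descriptor)."""
    amps = []
    for h in range(1, n_harmonics + 1):
        b = h * f0_bin
        amps.append(frame_spectrum[b] if b < len(frame_spectrum) else 0.0)
    amps = np.asarray(amps) + 1e-12
    return np.log(amps / amps[0])  # normalize so the descriptor is gain-invariant


def learn_models(descriptors, n_sources=2, n_iter=50, seed=0):
    """Toy k-means over harmonic-structure descriptors: one centroid per source.
    The paper learns a per-signal model; plain k-means is an assumption used here."""
    rng = np.random.default_rng(seed)
    X = np.asarray(descriptors)
    centers = X[rng.choice(len(X), n_sources, replace=False)]
    for _ in range(n_iter):
        labels = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for k in range(n_sources):
            if np.any(labels == k):
                centers[k] = X[labels == k].mean(axis=0)
    return centers


def assign_to_source(descriptor, centers):
    """Separation step: assign a detected harmonic structure to the closest learned model."""
    return int(np.argmin(((centers - descriptor) ** 2).sum(-1)))


if __name__ == "__main__":
    # Synthetic check: two 'instruments' with different but stable harmonic envelopes.
    rng = np.random.default_rng(1)
    spectra, truth = [], []
    for _ in range(200):
        src = rng.integers(2)
        envelope = [1.0, 0.8, 0.6, 0.4] if src == 0 else [1.0, 0.1, 0.7, 0.05]
        spec = np.zeros(512)
        f0 = rng.integers(20, 60)
        for h, a in enumerate(envelope, start=1):
            spec[h * f0] = a * (1 + 0.05 * rng.standard_normal())
        spectra.append((spec, f0))
        truth.append(src)

    descs = [harmonic_structure(s, f0, n_harmonics=4) for s, f0 in spectra]
    models = learn_models(descs, n_sources=2)
    labels = [assign_to_source(d, models) for d in descs]
    agreement = np.mean(np.array(labels) == np.array(truth))
    print("cluster/ground-truth agreement:", max(agreement, 1 - agreement))
```

In a full system the per-frame descriptors would come from spectral peaks and estimated fundamental frequencies of the mixture rather than synthetic data, and the learned models would drive a resynthesis or masking stage to recover each signal's waveform.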
[1] J. S. Downie, “Music information retrieval,” Annual Review of Information Science and Technology, vol. 37, pp. 295–340, 2003.
[2] Roger Dannenberg, “Music understanding by computer,” in IAKTA/LIST International Workshop on Knowledge Technology in the Arts Proc., 1993, pp. 41–56.
[3] G. J. Brown and M. Cooke, “Computational auditory scene analysis,” Computer Speech and Language, vol. 8, no. 4, pp. 297–336, 1994.
[4] M. Goto, “A robust predominant-F0 estimation method for real-time detection of melody and bass lines in CD recordings,” in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP2000), 2000, pp. 757–760.
[5] J. Pinquier, J. Rouas, and R. Andre-Obrecht, “Robust speech/music classification in audio documents,” in 7th International Conference On Spoken Language Processing (ICSLP), 2002, pp. 2005–2008.
[6] P. Comon, “Independent component analysis, a new concept?,” Signal Processing, vol. 36, pp. 287–314, 1994.
[7] Gil-Jin Jang and Te-Won Lee, “A probabilistic approach to single channel blind signal separation,” in Neural Information Processing Systems 15 (NIPS2002), 2003.
[8] Yazhong Feng, Yueting Zhuang, and Yunhe Pan, “Popular music retrieval by independent component analysis,” in ISMIR, 2002, pp. 281–282.
[9] Peter Vanroose, “Blind source separation of speech and background music for improved speech recognition,” in The 24th Symposium on Information Theory, May 2003, pp. 103–108.
[10] X. Serra, “Musical sound modeling with sinusoids plus noise,” in Musical Signal Processing, C. Roads, S. Pope, A. Piccialli, and G. De Poli, Eds. Swets & Zeitlinger Publishers, 1997.
[11] E. Terhardt, “Calculating virtual pitch,” Hearing Res., vol. 1, pp. 155–182, 1979.
[12] Yungang Zhang, Changshui Zhang, and Shijun Wang, “Clustering in knowledge embedded space,” in ECML, 2003, pp. 480–491.
[13] R. C. Maher and J. W. Beauchamp, “Fundamental frequency estimation of musical signals using a two-way mismatch procedure,” Journal of the Acoustical Society of America, vol. 95, no. 4, pp. 2254–2263, 1994.
[14] Serguei Koval, Mikhail Stolbov, and Mikhail Khitrov, “Broadband noise cancellation systems: new approach to working performance optimization,” in EUROSPEECH’99, 1999, pp. 2607–2610.
[15] Anssi Klapuri, “Automatic transcription of music,” M.S. thesis, Tampere University of Technology, Finland, 1998.
[16] Keerthi C. Nagaraj, “Toward automatic transcription - pitch tracking in polyphonic environment,” Literature survey, Mar. 2003.
[17] Hirokazu Kameoka, Takuya Nishimoto, and Shigeki Sagayama, “Separation of harmonic structures based on tied Gaussian mixture model and information criterion for concurrent sounds,” in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP2004), 2004.