emnlp emnlp2010 emnlp2010-84 emnlp2010-84-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Mark Dredze ; Aren Jansen ; Glen Coppersmith ; Ken Church
Abstract: There is considerable interest in interdisciplinary combinations of automatic speech recognition (ASR), machine learning, natural language processing, text classification and information retrieval. Many of these boxes, especially ASR, are often based on considerable linguistic resources. We would like to be able to process spoken documents with few (if any) resources. Moreover, connecting black boxes in series tends to multiply errors, especially when the key terms are out-of-vocabulary (OOV). The proposed alternative applies text processing directly to the speech without a dependency on ASR. The method finds long (∼1 sec) repetitions in speech, and clusters them into pseudo-terms (roughly phrases). Document clustering and classification work surprisingly well on pseudo-terms; performance on a Switchboard task approaches a baseline using gold standard manual transcriptions.
Enrique Amigó, Julio Gonzalo, Javier Artiles, and Felisa Verdejo. 2009. A comparison of extrinsic clustering evaluation metrics based on formal constraints. Information Retrieval, 12(4).
A. Bagga and B. Baldwin. 1998. Entity-based cross-document coreferencing using the vector space model. In Proceedings of the 17th international conference on Computational linguistics-Volume 1, pages 79–85. Association for Computational Linguistics.
A.L. Berger, V.J.D. Pietra, and S.A.D. Pietra. 1996. A maximum entropy approach to natural language processing. Computational Linguistics, 22(1):39–71.
Chris Callison-Burch and Mark Dredze. 2010. Creating speech and language data with Amazon’s Mechanical Turk. In Workshop on Creating Speech and Language Data With Mechanical Turk at NAACL-HLT.
K. W. Church and J. I. Helfman. 1993. Dotplot: A program for exploring self-similarity in millions of lines of text and code. Journal of Computational and Graphical Statistics.
Aaron Clauset, Mark E J Newman, and Cristopher Moore. 2004. Finding community structure in very large networks. Physical Review E, 70.
Koby Crammer, Ofer Dekel, Joseph Keshet, Shai Shalev-Shwartz, and Yoram Singer. 2006. Online passive-aggressive algorithms. Journal of Machine Learning Research (JMLR).
Koby Crammer, Mark Dredze, and Alex Kulesza. 2009. Multi-class confidence weighted algorithms. In Empirical Methods in Natural Language Processing (EMNLP).
Mark Dredze, Koby Crammer, and Fernando Pereira. 2008. Confidence-weighted linear classification. In International Conference on Machine Learning (ICML).
Alvin Garcia and Herbert Gish. 2006. Keyword spotting of arbitrary words using minimal speech resources. In ICASSP.
J.J. Godfrey, E.C. Holliman, and J. McDaniel. 1992. SWITCHBOARD: Telephone speech corpus for research and development. In ICASSP.
Timothy J. Hazen and Anna Margolis. 2008. Discriminative feature weighting using MCE training for topic identification of spoken audio recordings. In ICASSP.
Timothy J. Hazen, Fred Richardson, and Anna Margolis. 2007. Topic identification from audio recordings using word and phone recognition lattices. In IEEE Workshop on Automatic Speech Recognition and Understanding.
Aren Jansen, Kenneth Church, and Hynek Hermansky. 2010. Towards spoken term discovery at scale with zero resources. In Interspeech.
T. Joachims. 1998. Text categorization with support vector machines: Learning with many relevant features. In European Conference on Machine Learning (ECML).
George Karypis. 2003. CLUTO: A software package for clustering high-dimensional data sets. Technical Report 02-017, University of Minnesota, Dept. of Computer Science.
Igor Malioutov, Alex Park, Regina Barzilay, and James Glass. 2007. Making Sense of Sound: Unsupervised Topic Segmentation Over Acoustic Input. In ACL.
Scott Novotney and Richard Schwartz. 2009. Analysis of low-resource acoustic model self-training. In Interspeech.
Alex Park and James R. Glass. 2008. Unsupervised pattern discovery in speech. IEEE Transactions on Audio, Speech, and Language Processing.
R. Snow, B. O’Connor, D. Jurafsky, and A.Y. Ng. 2008. Cheap and fast - but is it good?: Evaluating non-expert annotations for natural language tasks. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 254–263. Association for Computational Linguistics.
S. Thomas, S. Ganapathy, and H. Hermansky. 2009. Phoneme recognition using spectral envelope and modulation frequency features. In Proc. of ICASSP.
H.J. Zeng, Q.C. He, Z. Chen, W.Y. Ma, and J. Ma. 2004. Learning to cluster web search results. In Conference on Research and development in information retrieval (SIGIR).