
104 emnlp-2010-The Necessity of Combining Adaptation Methods


Source: pdf

Author: Ming-Wei Chang ; Michael Connor ; Dan Roth

Abstract: Problems stemming from domain adaptation continue to plague the statistical natural language processing community, and there has been continuing work on general-purpose algorithms to alleviate them. In this paper we argue that existing general-purpose approaches usually focus on only one of two issues underlying the difficulties of adaptation: 1) differences in base feature statistics or 2) task differences that can be detected with labeled data. We argue that it is necessary to combine these two classes of adaptation algorithms, using evidence collected through theoretical analysis and experiments on simulated and real-world data. We find that the combined approach often outperforms the individual adaptation approaches. By combining simple approaches from each class of adaptation algorithm, we achieve state-of-the-art results on both the Named Entity Recognition and the Preposition Sense Disambiguation adaptation tasks. We also show that applying an adaptation algorithm that finds a shared representation between domains often impacts the choice of adaptation algorithm that makes use of target labeled data.
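
To make the combination concrete, the following is a minimal sketch of how the two classes of methods named in the abstract are often instantiated: a shared-representation step (here, word-cluster features standing in for Brown-cluster or SCL-style features induced from unlabeled data) composed with a labeled-target-data step (Daumé III 2007's "frustratingly easy" feature augmentation). This is not the authors' pipeline; the function and variable names (add_cluster_features, augment, word_clusters) are illustrative assumptions.

```python
# Sketch: combine a shared-representation feature map with feature augmentation.
# Assumed helpers; not APIs from the paper.

def add_cluster_features(features, word, word_clusters):
    """Append a cluster-id feature alongside the lexical feature."""
    features = dict(features)
    features["word=" + word] = 1.0
    cluster = word_clusters.get(word)          # clusters induced from unlabeled text
    if cluster is not None:
        features["cluster=" + cluster] = 1.0   # shared representation across domains
    return features

def augment(features, domain):
    """Daume-style augmentation: every feature gets a general copy plus a
    domain-specific copy, so target labeled data can correct task differences."""
    out = {}
    for name, value in features.items():
        out["general:" + name] = value
        out[domain + ":" + name] = value
    return out

# Toy usage: two words from different domains share evidence through the
# "general:" and "cluster=" features, while the "target:" copies let a linear
# learner specialize on the target domain.
word_clusters = {"Tuesday": "0110", "Wednesday": "0110"}   # assumed cluster ids
src = augment(add_cluster_features({}, "Tuesday", word_clusters), "source")
tgt = augment(add_cluster_features({}, "Wednesday", word_clusters), "target")
print(sorted(src))
print(sorted(tgt))
```

Under this sketch, the shared-representation step changes which features exist before the augmentation step copies them, which is one way to read the abstract's claim that the first class of algorithm affects the behavior of the second.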


reference text

Rie Kubota Ando and Tong Zhang. 2005. A framework for learning predictive structures from multiple tasks and unlabeled data. Journal of Machine Learning Research.
John Blitzer, Ryan McDonald, and Fernando Pereira. 2006. Domain adaptation with structural correspondence learning. In EMNLP.
P. F. Brown, V. J. Della Pietra, P. V. deSouza, J. C. Lai, and R. L. Mercer. 1992. Class-based n-gram models of natural language. Computational Linguistics.
Ciprian Chelba and Alex Acero. 2004. Adaptation of maximum entropy capitalizer: Little data can help a lot. In EMNLP.
D. Dahlmeier, H. T. Ng, and T. Schultz. 2009. Joint learning of preposition senses and semantic roles of prepositional phrases. In EMNLP.
Hal Daumé III. 2007. Frustratingly easy domain adaptation. In ACL.
T. Evgeniou and M. Pontil. 2004. Regularized multi-task learning. In KDD.
J. R. Finkel and C. D. Manning. 2009. Hierarchical Bayesian domain adaptation. In NAACL.
C.-J. Hsieh, K.-W. Chang, C.-J. Lin, S. S. Keerthi, and S. Sundararajan. 2008. A dual coordinate descent method for large-scale linear SVM. In ICML.
Fei Huang and Alexander Yates. 2009. Distributional representations for handling sparsity in supervised sequence-labeling. In ACL.
Jing Jiang and ChengXiang Zhai. 2007. Instance weighting for domain adaptation in NLP. In ACL.
T. Koo, X. Carreras, and M. Collins. 2008. Simple semi-supervised dependency parsing. In ACL.
J. Lafferty, A. McCallum, and F. Pereira. 2001. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In ICML.
P. Liang. 2005. Semi-supervised learning for natural language. Master's thesis, Massachusetts Institute of Technology.
A. Novikoff. 1963. On convergence proofs for perceptrons. In Proceedings of the Symposium on the Mathematical Theory of Automata.
L. Ratinov and D. Roth. 2009. Design challenges and misconceptions in named entity recognition. In CoNLL.
Erik F. Tjong Kim Sang and Fien De Meulder. 2003. Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. In CoNLL.
S. Tratz and D. Hovy. 2009. Disambiguation of preposition sense using linguistically motivated features. In NAACL.
Tong Zhang. 2002. Covering number bounds of certain regularized linear function classes. Journal of Machine Learning Research.