
Discriminative Improvements to Distributional Sentence Similarity (EMNLP 2013, paper 64)



Author: Yangfeng Ji ; Jacob Eisenstein

Abstract: Matrix and tensor factorization have been applied to a number of semantic relatedness tasks, including paraphrase identification. The key idea is that similarity in the latent space implies semantic relatedness. We describe three ways in which labeled data can improve the accuracy of these approaches on paraphrase classification. First, we design a new discriminative term-weighting metric called TF-KLD, which outperforms TF-IDF. Next, we show that using the latent representation from matrix factorization as features in a classification algorithm substantially improves accuracy. Finally, we combine latent features with fine-grained n-gram overlap features, yielding performance that is 3% more accurate than the prior state-of-the-art.
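The key idea stated in the abstract, that similarity in the latent space implies semantic relatedness, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the toy sentence-by-term counts, the use of truncated SVD as the factorization, the latent dimension k=2, and the helper names `latent_vectors` and `cosine`. TF-KLD weighting and the downstream classifier are not shown.

```python
import numpy as np

def latent_vectors(counts, k):
    """Map each sentence (row) to a k-dimensional latent vector via truncated SVD."""
    u, s, _ = np.linalg.svd(counts, full_matrices=False)
    return u[:, :k] * s[:k]

def cosine(a, b):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Hypothetical toy data. Columns: cat, feline, sat, stocks, fell
counts = np.array([
    [1, 0, 1, 0, 0],  # s0: "cat sat"
    [0, 1, 1, 0, 0],  # s1: "feline sat"
    [1, 1, 0, 0, 0],  # s2: "cat feline" (links the two synonyms)
    [0, 0, 0, 1, 1],  # s3: "stocks fell"
], dtype=float)

z = latent_vectors(counts, k=2)

sim_raw = cosine(counts[0], counts[1])   # surface overlap only
sim_paraphrase = cosine(z[0], z[1])      # similarity in the latent space
sim_unrelated = cosine(z[0], z[3])

print(sim_raw, sim_paraphrase, sim_unrelated)
```

In this toy setting the two near-paraphrases share only one surface term, yet their latent vectors are closer than their raw count vectors, while the unrelated sentence stays far away. A classifier in the spirit of the abstract would consume features derived from such latent vectors instead of thresholding the cosine directly.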


reference text

Sanjeev Arora, Rong Ge, and Ankur Moitra. 2012. Learning Topic Models - Going beyond SVD. In FOCS, pages 1–10.
Rahul Bhagat and Eduard Hovy. 2013. What Is a Paraphrase? Computational Linguistics.
William Blacoe and Mirella Lapata. 2012. A Comparison of Vector-based Representations for Semantic Composition. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 546–556, Stroudsburg, PA, USA. Association for Computational Linguistics.
Fan Bu, Hang Li, and Xiaoyan Zhu. 2012. String Rewriting Kernel. In Proceedings of ACL, pages 449–458. Association for Computational Linguistics.
Dipanjan Das and Noah A. Smith. 2009. Paraphrase identification as probabilistic quasi-synchronous recognition. In Proceedings of the Joint Conference of the Annual Meeting of the Association for Computational Linguistics and the International Joint Conference on Natural Language Processing, pages 468–476, Stroudsburg, PA, USA. Association for Computational Linguistics.
Bill Dolan, Chris Quirk, and Chris Brockett. 2004. Unsupervised Construction of Large Paraphrase Corpora: Exploiting Massively Parallel News Sources. In COLING.
Rong-En Fan, Kai-Wei Chang, Cho-Jui Hsieh, Xiang-Rui Wang, and Chih-Jen Lin. 2008. LIBLINEAR: A Library for Large Linear Classification. Journal of Machine Learning Research, 9:1871–1874.
Alexander Gammerman, Volodya Vovk, and Vladimir Vapnik. 1998. Learning by transduction. In Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence, pages 148–155. Morgan Kaufmann Publishers Inc.
Juri Ganitkevitch, Benjamin Van Durme, and Chris Callison-Burch. 2013. PPDB: The Paraphrase Database. In Proceedings of NAACL, pages 758–764. Association for Computational Linguistics.
Weiwei Guo and Mona Diab. 2012. Modeling Sentences in the Latent Space. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, pages 864–872, Stroudsburg, PA, USA. Association for Computational Linguistics.
David Kauchak and Regina Barzilay. 2006. Paraphrasing for automatic evaluation. In Proceedings of NAACL, pages 455–462. Association for Computational Linguistics.
Thomas Landauer, Peter W. Foltz, and Darrel Laham. 1998. Introduction to Latent Semantic Analysis. Discourse Processes, 25:259–284.
Daniel D. Lee and H. Sebastian Seung. 2001. Algorithms for Non-Negative Matrix Factorization. In Advances in Neural Information Processing Systems (NIPS).
Nitin Madnani, Joel R. Tetreault, and Martin Chodorow. 2012. Re-examining Machine Translation Metrics for Paraphrase Identification. In HLT-NAACL, pages 182–190. The Association for Computational Linguistics.
Rada Mihalcea, Courtney Corley, and Carlo Strapparava. 2006. Corpus-based and knowledge-based measures of text semantic similarity. In AAAI.
Kevin P. Murphy. 2012. Machine Learning: A Probabilistic Perspective. The MIT Press.
Joakim Nivre, Johan Hall, Jens Nilsson, Atanas Chanev, Gülsen Eryigit, Sandra Kübler, Svetoslav Marinov, and Erwin Marsi. 2007. MaltParser: A language-independent system for data-driven dependency parsing. Natural Language Engineering, 13(2):95–135.
Saša Petrović, Miles Osborne, and Victor Lavrenko. 2010. Streaming first story detection with application to Twitter. In Proceedings of HLT-NAACL, pages 181–189. Association for Computational Linguistics.
Richard Socher, Eric H. Huang, Jeffrey Pennington, Andrew Y. Ng, and Christopher D. Manning. 2011. Dynamic Pooling And Unfolding Recursive Autoencoders For Paraphrase Detection. In Advances in Neural Information Processing Systems (NIPS).
Joseph Turian, Lev Ratinov, and Yoshua Bengio. 2010. Word Representation: A Simple and General Method for Semi-Supervised Learning. In ACL, pages 384–394.
Peter D. Turney and Patrick Pantel. 2010. From Frequency to Meaning: Vector Space Models of Semantics. JAIR, 37:141–188.
Stephen Wan, Mark Dras, Robert Dale, and Cecile Paris. 2006. Using Dependency-based Features to Take the “Para-farce” out of Paraphrase. In Proceedings of the Australasian Language Technology Workshop.
Dekai Wu. 2005. Recognizing paraphrases and textual entailment using inversion transduction grammars. In Proceedings of the ACL Workshop on Empirical Modeling of Semantic Equivalence and Entailment, pages 25–30. Association for Computational Linguistics.
Wei Xu, Xin Liu, and Yihong Gong. 2003. Document Clustering based on Non-Negative Matrix Factorization. In SIGIR, pages 267–273.