245 acl-2010-Understanding the Semantic Structure of Noun Phrase Queries

Source: pdf

Author: Xiao Li

Abstract: Determining the semantic intent of web queries involves not only identifying their semantic class, which is a primary focus of previous work, but also understanding their semantic structure. In this work, we formally define the semantic structure of noun phrase queries as comprising intent heads and intent modifiers. We present methods that automatically identify these constituents, as well as their semantic roles, based on Markov and semi-Markov conditional random fields. We show that the use of semantic features and syntactic features contributes significantly to improving understanding performance.
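
To make the structure named in the abstract concrete, here is a minimal Python sketch of a noun phrase query split into one intent head and several intent modifiers. It is an illustration under assumptions, not the paper's implementation: the names `Segment`, `intent_head`, `intent_modifiers` and `to_token_tags`, the role codes `IH`/`IM`, the semantic labels `Title` and `Year`, the example query, and the B-/I- token encoding are all hypothetical. The sketch also shows the general difference between the two model families: a semi-Markov CRF assigns labels to whole segments, while a Markov (linear-chain) CRF assigns one label per token.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Segment:
    """A contiguous span of query tokens carrying one label (segment-level view)."""
    tokens: Tuple[str, ...]
    role: str                             # "IH" = intent head, "IM" = intent modifier (assumed codes)
    semantic_label: Optional[str] = None  # hypothetical semantic role, e.g. "Title"


def intent_head(segments: List[Segment]) -> Optional[Segment]:
    """Return the first segment marked as intent head, or None if there is none."""
    for seg in segments:
        if seg.role == "IH":
            return seg
    return None


def intent_modifiers(segments: List[Segment]) -> List[Segment]:
    """Return all segments marked as intent modifiers, in query order."""
    return [seg for seg in segments if seg.role == "IM"]


def to_token_tags(segments: List[Segment]) -> List[str]:
    """Flatten segment labels into one tag per token (token-level view).

    The B-/I- prefix scheme is an assumption made for this sketch.
    """
    tags: List[str] = []
    for seg in segments:
        label = seg.role if seg.semantic_label is None else f"{seg.role}:{seg.semantic_label}"
        for i in range(len(seg.tokens)):
            tags.append(("B-" if i == 0 else "I-") + label)
    return tags


# Illustrative query only: "alice in wonderland 2010 cast"
EXAMPLE = [
    Segment(("alice", "in", "wonderland"), "IM", "Title"),
    Segment(("2010",), "IM", "Year"),
    Segment(("cast",), "IH"),
]

if __name__ == "__main__":
    print(intent_head(EXAMPLE).tokens)
    print(to_token_tags(EXAMPLE))
```

In this sketch the segment list is the representation a segment-level model would predict directly, and `to_token_tags` gives the equivalent per-token sequence that a token-level model would predict.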

reference text

Jaime Arguello, Fernando Diaz, Jamie Callan, and Jean-Francois Crespo. 2009. Sources of evidence for vertical selection. In SIGIR ’09: Proceedings of the 32nd Annual International ACM SIGIR conference on Research and Development in Information Retrieval.

Cory Barr, Rosie Jones, and Moira Regelson. 2008. The linguistic structure of English web-search queries. In Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, pages 1021–1030.

Chris Burges, Tal Shaked, Erin Renshaw, Ari Lazier, Matt Deeds, Nicole Hamilton, and Greg Hullender. 2005. Learning to rank using gradient descent. In ICML’05: Proceedings of the 22nd international conference on Machine learning, pages 89–96.

Michael Collins. 1999. Head-Driven Statistical Models for Natural Language Parsing. Ph.D. thesis, University of Pennsylvania.

Jianfeng Gao, Jian-Yun Nie, Jian Zhang, Endong Xun, Ming Zhou, and Chang-Ning Huang. 2001. Improving query translation for CLIR using statistical models. In SIGIR ’01: Proceedings of the 24th Annual International ACM SIGIR conference on Research and Development in Information Retrieval.

Jinyoung Kim, Xiaobing Xue, and Bruce Croft. 2009. A probabilistic retrieval model for semistructured data. In ECIR ’09: Proceedings of the 31st European Conference on Information Retrieval, pages 228–239.

John Lafferty, Andrew McCallum, and Fernando Pereira. 2001. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the International Conference on Machine Learning, pages 282–289.

Fangtao Li, Xian Zhang, Jinhui Yuan, and Xiaoyan Zhu. 2008a. Classifying what-type questions by head noun tagging. In COLING’08: Proceedings of the 22nd International Conference on Computational Linguistics, pages 481–488.

Xiao Li, Ye-Yi Wang, and Alex Acero. 2008b. Learning query intent from regularized click graph. In SIGIR ’08: Proceedings of the 31st Annual International ACM SIGIR conference on Research and Development in Information Retrieval, July.

Xiao Li, Ye-Yi Wang, and Alex Acero. 2009. Extracting structured information from user queries with semi-supervised conditional random fields. In SIGIR ’09: Proceedings of the 32nd Annual International ACM SIGIR conference on Research and Development in Information Retrieval.

Mehdi Manshadi and Xiao Li. 2009. Semantic tagging of web search queries. In Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP of the AFNLP.

Andrew McCallum and Wei Li. 2003. Early results for named entity recognition with conditional random fields, feature induction and web-enhanced lexicons. In Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003, pages 188–191.

Donald Metzler and Bruce Croft. 2005. Analysis of statistical question classification for fact-based questions. Journal of Information Retrieval, 8(3).

Patrick Pantel and Marco Pennacchiotti. 2006. Espresso: Leveraging generic patterns for automatically harvesting semantic relations. In Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the ACL, pages 113–120.

Stelios Paparizos, Alexandros Ntoulas, John Shafer, and Rakesh Agrawal. 2009. Answering web queries using structured data sources. In Proceedings of the 35th SIGMOD international conference on Management of data.

Marius Pasca and Benjamin Van Durme. 2007. What you seek is what you get: Extraction of class attributes from query logs. In IJCAI’07: Proceedings of the 20th International Joint Conference on Artificial Intelligence.

Marius Pasca and Benjamin Van Durme. 2008. Weakly-supervised acquisition of open-domain classes and class attributes from web documents and query logs. In Proceedings of ACL-08: HLT.

Marco Pennacchiotti and Patrick Pantel. 2009. Entity extraction via ensemble semantics. In EMNLP’09: Proceedings of Conference on Empirical Methods in Natural Language Processing, pages 238–247.

Stephen Robertson, Hugo Zaragoza, and Michael Taylor. 2004. Simple BM25 extension to multiple weighted fields. In CIKM’04: Proceedings of the thirteenth ACM international conference on Information and knowledge management, pages 42–49.

Sunita Sarawagi and William W. Cohen. 2004. Semi-Markov conditional random fields for information extraction. In Advances in Neural Information Processing Systems (NIPS’04).

Dou Shen, Jian-Tao Sun, Qiang Yang, and Zheng Chen. 2006. Building bridges for web query classification. In SIGIR ’06: Proceedings of the 29th Annual International ACM SIGIR conference on research and development in information retrieval, pages 131–138.