
138 nips-2000-The Use of Classifiers in Sequential Inference

Source: pdf

Author: Vasin Punyakanok, Dan Roth

Abstract: We study the problem of combining the outcomes of several different classifiers in a way that provides a coherent inference that satisfies some constraints. In particular, we develop two general approaches for an important subproblem - identifying phrase structure. The first is a Markovian approach that extends standard HMMs to allow the use of a rich observation structure and of general classifiers to model state-observation dependencies. The second is an extension of constraint satisfaction formalisms. We develop efficient combination algorithms under both models and study them experimentally in the context of shallow parsing.

reference text

[1] S. P. Abney. Parsing by chunks. In R. C. Berwick, S. P. Abney, and C. Tenny, editors, Principle-based parsing: Computation and Psycholinguistics, pages 257-278. Kluwer, Dordrecht, 1991.

[2] D. Appelt, J. Hobbs, J. Bear, D. Israel, and M. Tyson. FASTUS: A finite-state processor for information extraction from real-world text. In Proc. of IJCAI, 1993.

[3] S. Argamon, I. Dagan, and Y. Krymolowski. A memory-based approach to learning shallow natural language patterns. Journal of Experimental and Theoretical Artificial Intelligence, special issue on memory-based learning, 10:1-22, 1999.

[4] C. Burge and S. Karlin. Finding the genes in genomic DNA. Current Opinion in Structural Biology, 8:346-354, 1998.

[5] C. Cardie and D. Pierce. Error-driven pruning of treebank grammars for base noun phrase identification. In Proceedings of ACL-98, pages 218-224, 1998.

[6] A. Carlson, C. Cumby, J. Rosen, and D. Roth. The SNoW learning architecture. Technical Report UIUCDCS-R-99-2101, UIUC Computer Science Department, May 1999.

[7] K. W. Church. A stochastic parts program and noun phrase parser for unrestricted text. In Proc. of ACL Conference on Applied Natural Language Processing, 1988.

[8] J. W. Fickett. The gene identification problem: An overview for developers. Computers and Chemistry, 20:103-118, 1996.

[9] D. Freitag and A. McCallum. Information extraction using HMMs and shrinkage. In Papers from the AAAI-99 Workshop on Machine Learning for Information Extraction, pages 31-36, 1999.

[10] A. R. Golding and D. Roth. A Winnow based approach to context-sensitive spelling correction. Machine Learning, 34(1-3):107-130, 1999.

[11] G. Grefenstette. Evaluation techniques for automatic semantic extraction: comparing syntactic and window based approaches. In ACL'93 workshop on the Acquisition of Lexical Knowledge from Text, 1993.

[12] R. Grishman. The NYU system for MUC-6 or where's syntax? In B. Sundheim, editor, Proceedings of the Sixth Message Understanding Conference. Morgan Kaufmann Publishers, 1995.

[13] D. Gusfield and L. Pitt. A bounded approximation for the minimum cost 2-SAT problem. Algorithmica, 8:103-117, 1992.

[14] Z. S. Harris. Co-occurrence and transformation in linguistic structure. Language, 33(3):283-340, 1957.

[15] D. Haussler. Computational genefinding. Trends in Biochemical Sciences, Supplementary Guide to Bioinformatics, pages 12-15, 1998.

[16] R. Khardon and D. Roth. Learning to reason. J. ACM, 44(5):697-725, Sept. 1997.

[17] A. Mackworth. Constraint Satisfaction. In S. C. Shapiro, editor, Encyclopedia of Artificial Intelligence, pages 285-293, 1992. Volume 1, second edition.

[18] M. P. Marcus, B. Santorini, and M. Marcinkiewicz. Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics, 19(2):313-330, June 1993.

[19] A. McCallum, D. Freitag, and F. Pereira. Maximum entropy Markov models for information extraction and segmentation. In Proceedings of ICML-2000, 2000. To appear.

[20] N. Morgan and H. Bourlard. Continuous speech recognition. IEEE Signal Processing Magazine, 12(3):24-42, 1995.

[21] M. Munoz, V. Punyakanok, D. Roth, and D. Zimak. A learning approach to shallow parsing. In EMNLP-VLC'99, 1999.

[22] L. R. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2):257-285, 1989.

[23] L. A. Ramshaw and M. P. Marcus. Text chunking using transformation-based learning. In Proceedings of the Third Annual Workshop on Very Large Corpora, 1995.

[24] D. Roth. Learning to resolve natural language ambiguities: A unified approach. In Proceedings of the National Conference on Artificial Intelligence, pages 806-813, 1998.

[25] D. Roth, M.-H. Yang, and N. Ahuja. Learning to recognize objects. In CVPR'00, The IEEE Conference on Computer Vision and Pattern Recognition, pages 724-731, 2000.

[26] L. G. Valiant. Projection learning. In Proceedings of the Conference on Computational Learning Theory, pages 287-293, 1998.