43 jmlr-2007-Integrating Naïve Bayes and FOIL

Source: pdf

Author: Niels Landwehr, Kristian Kersting, Luc De Raedt

Abstract: A novel relational learning approach that tightly integrates the naïve Bayes learning scheme with the inductive logic programming rule-learner FOIL is presented. In contrast to previous combinations that have employed naïve Bayes only for post-processing the rule sets, the presented approach employs the naïve Bayes criterion to guide its search directly. The proposed technique is implemented in the nFOIL and tFOIL systems, which employ standard naïve Bayes and tree augmented naïve Bayes models respectively. We show that these integrated approaches to probabilistic model and rule learning outperform post-processing approaches. They also yield significantly more accurate models than simple rule learning and are competitive with more sophisticated ILP systems.

Keywords: rule learning, naïve Bayes, statistical relational learning, inductive logic programming
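To make the idea of letting the naïve Bayes criterion guide the search more concrete, the following is a minimal propositional sketch, not the authors' implementation. It assumes that each candidate clause has already been evaluated into a boolean column ("clause covers example"), whereas the systems described in the abstract search over first-order clauses. The function names `nb_cond_log_likelihood` and `greedy_select`, the Laplace smoothing parameter `alpha`, and the stopping threshold `min_gain` are all hypothetical choices made for illustration.

```python
import math


def nb_cond_log_likelihood(features, labels, selected, alpha=1.0):
    """Sum over examples of log P(label | selected features) under a
    Bernoulli naive Bayes model fitted on the same data.
    Laplace smoothing (alpha) is an assumption of this sketch."""
    classes = sorted(set(labels))
    n = len(labels)
    count = {c: sum(1 for y in labels if y == c) for c in classes}
    prior = {c: (count[c] + alpha) / (n + alpha * len(classes)) for c in classes}
    # p_true[(j, c)] estimates P(feature j is true | class c).
    p_true = {}
    for j in selected:
        for c in classes:
            hits = sum(1 for x, y in zip(features, labels) if y == c and x[j])
            p_true[(j, c)] = (hits + alpha) / (count[c] + 2 * alpha)

    total = 0.0
    for x, y in zip(features, labels):
        log_joint = {}
        for c in classes:
            lj = math.log(prior[c])
            for j in selected:
                p = p_true[(j, c)]
                lj += math.log(p if x[j] else 1.0 - p)
            log_joint[c] = lj
        # Normalise with log-sum-exp to obtain log P(y | x).
        m = max(log_joint.values())
        log_norm = m + math.log(sum(math.exp(v - m) for v in log_joint.values()))
        total += log_joint[y] - log_norm
    return total


def greedy_select(features, labels, max_features=3, min_gain=1e-9):
    """Greedy covering-style loop: repeatedly add the candidate column that
    most improves the naive Bayes score; stop when nothing improves it."""
    selected = []
    best = nb_cond_log_likelihood(features, labels, selected)
    n_cols = len(features[0])
    while len(selected) < max_features:
        candidates = [j for j in range(n_cols) if j not in selected]
        if not candidates:
            break
        score, j = max(
            (nb_cond_log_likelihood(features, labels, selected + [j]), j)
            for j in candidates
        )
        if score - best <= min_gain:
            break
        selected.append(j)
        best = score
    return selected, best


# Toy data: column 0 matches the label, column 1 is weakly related,
# column 2 is constant and therefore carries no information.
features = [(1, 0, 1), (1, 1, 1), (1, 0, 1), (0, 1, 1), (0, 0, 1), (0, 1, 1)]
labels = [1, 1, 1, 0, 0, 0]
chosen, score = greedy_select(features, labels)
print(chosen, score)
```

The point of the sketch is the contrast drawn in the abstract: a post-processing approach would first fix the feature set with an ordinary rule learner and only then fit naïve Bayes, whereas here the probabilistic score itself decides which feature is added at each step.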

reference text

Hendrik Blockeel and Luc De Raedt. Lookahead and discretization in ILP. In N. Lavrač and S. Džeroski, editors, Proceedings of the Seventh International Workshop on Inductive Logic Programming (ILP-1997), volume 1297 of Lecture Notes in Computer Science, pages 77–84. Springer, 1997.

Mark Craven and Seán Slattery. Relational learning with statistical predicate invention: Better models for hypertext. Machine Learning, 43(1–2):97–119, 2001.

Jesse Davis, Irene M. Ong, Vítor Santos Costa, David Page, and Inês Dutra. Using Bayesian classifiers to combine rules. In Working Notes of the Third Workshop on Multi-Relational Data Mining (MRDM-2004) in conjunction with the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2004), Seattle, Washington, USA, 2004.

Jesse Davis, Elizabeth Burnside, Inês de Castro Dutra, David Page, and Vítor Santos Costa. An integrated approach to learning Bayesian networks of rules. In Joao Gama, Rui Camacho, Pavel Brazdil, Alípio Jorge, and Luís Torgo, editors, Proceedings of the Sixteenth European Conference on Machine Learning (ECML-2005), volume 3720 of Lecture Notes in Computer Science, pages 84–95. Springer, 2005.

Luc De Raedt and Kristian Kersting. Probabilistic logic learning. ACM-SIGKDD Explorations, 5(1):31–48, 2003.

Luc De Raedt and Kristian Kersting. Probabilistic inductive logic programming. In S. Ben-David, J. Case, and A. Maruoka, editors, Proceedings of the Fifteenth International Conference on Algorithmic Learning Theory (ALT-2004), volume 3244 of Lecture Notes in Computer Science, pages 19–36. Springer, 2004.

Luc De Raedt and Jan Ramon. Condensed representations for inductive logic programming. In Proceedings of the Ninth International Conference on Principles of Knowledge Representation and Reasoning (KR-2004), Whistler, Canada, 2004. AAAI Press.

Luc Dehaspe. Maximum entropy modeling with clausal constraints. In N. Lavrač and S. Džeroski, editors, Proceedings of the Seventh International Workshop on Inductive Logic Programming (ILP-1997), volume 1297 of Lecture Notes in Computer Science, pages 109–124. Springer, 1997.

Luc Dehaspe, Hannu Toivonen, and Ross D. King. Finding frequent substructures in chemical compounds. In Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining (KDD-1998), New York City, New York, USA, 1998. AAAI Press.

Saso Džeroski, Steffen Schulze-Kremer, Karsten Heidtke, Karsten Siems, Dietrich Wettschereck, and Hendrik Blockeel. Diterpene structure elucidation from 13C NMR spectra with inductive logic programming. Applied Artificial Intelligence, Special Issue on First-Order Knowledge Discovery in Databases, 12:363–383, 1998.

Hong Fang, Weida Tong, Leming M. Shi, Robert Blair, Roger Perkins, William Branham, Bruce S. Hass, Qian Xie, Stacy L. Dial, Carrie L. Moland, and Daniel M. Sheehan. Structure-activity relationships for a large diverse set of natural, synthetic, and environmental estrogens. Chemical Research in Toxicology, 14(3):280–294, 2001.

Tom Fawcett. ROC graphs: Notes and practical considerations for data mining researchers, 2003.

Peter Flach and Nicholas Lachiche. Naive Bayesian classification of structured data. Machine Learning, 57(3):233–269, 2004.

Nir Friedman and Moises Goldszmidt. Building classifiers using Bayesian networks. In Proceedings of the Thirteenth National Conference on Artificial Intelligence (AAAI-1996), Vol. 2, pages 1277–1284, Portland, Oregon, USA, 1996. AAAI Press / The MIT Press.

Lise Getoor and Ben Taskar, editors. Statistical Relational Learning. MIT Press, 2007. In press.

Lise Getoor, Nir Friedman, Daphne Koller, and Avi Pfeffer. Learning probabilistic relational models. In Relational Data Mining. Springer, 2001.

Daniel Grossman and Pedro Domingos. Learning Bayesian network classifiers by maximizing conditional likelihood. In Proceedings of the Twenty-First International Conference on Machine Learning (ICML-2004), pages 361–368, Banff, Canada, 2004. ACM Press.

Ross D. King, Ashwin Srinivasan, and Michael J. E. Sternberg. Relating chemical activity to structure: An examination of ILP successes. New Generation Computing, 13(2,4):411–433, 1995.

Mark A. Krogel and Stefan Wrobel. Transformation-based learning using multirelational aggregation. In Céline Rouveirol and Michèle Sebag, editors, Proceedings of the Eleventh International Conference on Inductive Logic Programming (ILP-2001), volume 2157 of Lecture Notes in Computer Science. Springer, 2001.

Niels Landwehr, Kristian Kersting, and Luc De Raedt. nFOIL: Integrating naïve Bayes and FOIL. In Proceedings of the Twentieth National Conference on Artificial Intelligence (AAAI-2005), pages 795–800, Pittsburgh, Pennsylvania, USA, 2005. AAAI Press.

Nada Lavrač and Saso Džeroski. Inductive Logic Programming. Ellis Horwood, 1994.

Stephen Muggleton. Inverse entailment and Progol. New Generation Computing, Special Issue on Inductive Logic Programming, 13:245–286, 1995.

Stephen Muggleton. Stochastic logic programs. In Advances in Inductive Logic Programming. IOS Press, 1996.

Stephen Muggleton and Luc De Raedt. Inductive logic programming: Theory and methods. Journal of Logic Programming, 19/20:629–679, 1994.

Stephen Muggleton and Cao Feng. Efficient induction of logic programs. In Proceedings of the First Conference on Algorithmic Learning Theory (ALT-1990), pages 368–381, Tokyo, Japan, 1990. Springer.

Jennifer Neville, David Jensen, and Brian Gallagher. Simple estimators for relational Bayesian classifiers. In Proceedings of the Third IEEE International Conference on Data Mining (ICDM-2003), pages 609–612, Melbourne, Florida, USA, 2003. IEEE Computer Society.

Claudia Perlich and Foster Provost. Distribution-based aggregation for relational learning with identifier attributes. Machine Learning, 62:65–105, 2006.

Uros Pompe and Igor Kononenko. Naive Bayesian classifier within ILP-R. In Proceedings of the Fifth International Workshop on Inductive Logic Programming (ILP-1995), pages 417–436, Tokyo, Japan, 1995.

Uros Pompe and Igor Kononenko. Probabilistic first-order classification. In N. Lavrač and S. Džeroski, editors, Proceedings of the Seventh International Workshop on Inductive Logic Programming (ILP-1997), volume 1297 of Lecture Notes in Computer Science, pages 235–242. Springer, 1997.

Alexandrin Popescul, Lyle H. Ungar, Steve Lawrence, and David M. Pennock. Statistical relational learning for document mining. In Proceedings of the Third IEEE International Conference on Data Mining (ICDM-2003), pages 275–282, Melbourne, Florida, USA, 2003. IEEE Computer Society.

Foster J. Provost, Tom Fawcett, and Ron Kohavi. The case against accuracy estimation for comparing induction algorithms. In Proceedings of the Fifteenth International Conference on Machine Learning (ICML-1998), Madison, Wisconsin, USA, 1998. Morgan Kaufmann.

J. Ross Quinlan. Learning logical definitions from relations. Machine Learning, pages 239–266, 1990.

Jorma Rissanen. Modeling by shortest data description. Automatica, 14:465–471, 1978.

Ashwin Srinivasan, Stephen Muggleton, Ross D. King, and Michael J. E. Sternberg. Theories for mutagenicity: A study of first-order and feature based induction. Artificial Intelligence, 85:277–299, 1996.

Ashwin Srinivasan, Ross D. King, and Douglas W. Bristol. An assessment of ILP-assisted models for toxicology and the PTE-3 experiment. In Saso Džeroski and Peter A. Flach, editors, Proceedings of the Ninth International Workshop on Inductive Logic Programming (ILP-1999), volume 1634 of Lecture Notes in Computer Science. Springer, 1999.

Ben Taskar, Eran Segal, and Daphne Koller. Probabilistic clustering in relational data. In Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence (IJCAI-2001), pages 870–878, Seattle, Washington, USA, 2001. Morgan Kaufmann.

Ian H. Witten and Eibe Frank. Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann, 2000.