72 jmlr-2007-Relational Dependency Networks

Source: pdf

Author: Jennifer Neville, David Jensen

Abstract: Recent work on graphical models for relational data has demonstrated significant improvements in classification and inference when models represent the dependencies among instances. Despite its use in conventional statistical models, the assumption of instance independence is contradicted by most relational data sets. For example, in citation data there are dependencies among the topics of a paper’s references, and in genomic data there are dependencies among the functions of interacting proteins. In this paper, we present relational dependency networks (RDNs), graphical models that are capable of expressing and reasoning with such dependencies in a relational setting. We discuss RDNs in the context of relational Bayes networks and relational Markov networks and outline the relative strengths of RDNs: namely, the ability to represent cyclic dependencies, simple methods for parameter estimation, and efficient structure learning techniques. The strengths of RDNs are due to the use of pseudolikelihood learning techniques, which estimate an efficient approximation of the full joint distribution. We present learned RDNs for a number of real-world data sets and evaluate the models in a prediction context, showing that RDNs identify and exploit cyclic relational dependencies to achieve significant performance gains over conventional conditional models. In addition, we use synthetic data to explore model performance under various relational data characteristics, showing that RDN learning and inference techniques are accurate over a wide range of conditions.

Keywords: relational learning, probabilistic relational models, knowledge discovery, graphical models, dependency networks, pseudolikelihood estimation
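As a rough illustration of the pseudolikelihood idea named in the abstract, the Python sketch below scores a labeling by summing log P(x_i | neighbors of i) over nodes, rather than evaluating the full joint distribution. It is not the authors' implementation: the four-paper citation cycle, the logistic form of the conditional, and the names `conditional_prob`, `log_pseudolikelihood`, `weight`, and `bias` are all assumptions made for this example.

```python
import math


def conditional_prob(label, neighbor_labels, weight, bias):
    """P(x_i = label | neighbor labels) under an assumed logistic conditional."""
    score = bias + weight * sum(neighbor_labels)
    p_one = 1.0 / (1.0 + math.exp(-score))
    return p_one if label == 1 else 1.0 - p_one


def log_pseudolikelihood(labels, edges, weight, bias):
    """Sum of log P(x_i | neighbors of i) over every node in the graph."""
    # Build an undirected adjacency list from the edge pairs.
    neighbors = {node: [] for node in labels}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)

    total = 0.0
    for node, label in labels.items():
        neighbor_labels = [labels[n] for n in neighbors[node]]
        total += math.log(conditional_prob(label, neighbor_labels, weight, bias))
    return total


# Hypothetical data: four papers whose citation links form a cycle.
labels = {"p1": 1, "p2": 1, "p3": 0, "p4": 0}
edges = [("p1", "p2"), ("p2", "p3"), ("p3", "p4"), ("p4", "p1")]

if __name__ == "__main__":
    print(log_pseudolikelihood(labels, edges, weight=1.0, bias=-1.0))
```

Each term conditions only on a node's immediate neighbors, so the cycle p1-p2-p3-p4-p1 needs no special handling and no normalizing constant over all joint labelings is computed.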

reference text

A. Bernstein, S. Clearwater, and F. Provost. The relational vector-space model and industry classification. In Proceedings of the IJCAI-2003 Workshop on Learning Statistical Models from Relational Data, pages 8–18, 2003.

J. Besag. Spatial interaction and the statistical analysis of lattice systems. Journal of the Royal Statistical Society, Series B, 36(2):192–236, 1974.

J. Besag. Statistical analysis of non-lattice data. The Statistician, 24(3):179–195, 1975.

H. Blau, N. Immerman, and D. Jensen. A visual query language for relational knowledge discovery. Technical Report 01-28, University of Massachusetts Amherst, Computer Science Department, 2001.

G. Casella and R. Berger. Statistical Inference. Duxbury, 2002.

S. Chakrabarti, B. Dom, and P. Indyk. Enhanced hypertext categorization using hyperlinks. In Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 307–318, 1998.

F. Comets. On consistency of a class of estimators for exponential families of Markov random fields on the lattice. The Annals of Statistics, 20(1):455–468, 1992.

C. Cortes, D. Pregibon, and C. Volinsky. Communities of interest. In Proceedings of the 4th International Symposium of Intelligent Data Analysis, pages 105–114, 2001.

M. Craven, D. DiPasquo, D. Freitag, A. McCallum, T. Mitchell, K. Nigam, and S. Slattery. Learning to extract symbolic knowledge from the World Wide Web. In Proceedings of the 15th National Conference on Artificial Intelligence, pages 509–516, 1998.

P. Domingos and M. Richardson. Mining the network value of customers. In Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 57–66, 2001.

T. Fawcett and F. Provost. Adaptive fraud detection. Data Mining and Knowledge Discovery, 1(3):291–316, 1997.

P. Flach and N. Lachiche. 1BC: A first-order Bayesian classifier. In Proceedings of the 9th International Conference on Inductive Logic Programming, pages 92–103, 1999.

N. Friedman, L. Getoor, D. Koller, and A. Pfeffer. Learning probabilistic relational models. In Proceedings of the 16th International Joint Conference on Artificial Intelligence, pages 1300–1309, 1999.

S. Geman and C. Graffigne. Markov random field image models and their applications to computer vision. In Proceedings of the 1986 International Congress of Mathematicians, pages 1496–1517, 1987.

L. Getoor, N. Friedman, D. Koller, and A. Pfeffer. Learning probabilistic relational models. In Relational Data Mining, pages 307–335. Springer-Verlag, 2001.

B. Gidas. Consistency of maximum likelihood and pseudolikelihood estimators for Gibbs distributions. In Proceedings of the Workshop on Stochastic Differential Systems with Applications in Electrical/Computer Engineering, Control Theory, and Operations Research, pages 129–145, 1986.

D. Heckerman, D. Chickering, C. Meek, R. Rounthwaite, and C. Kadie. Dependency networks for inference, collaborative filtering and data visualization. Journal of Machine Learning Research, 1:49–75, 2000.

D. Heckerman, C. Meek, and D. Koller. Probabilistic models for relational data. Technical Report MSR-TR-2004-30, Microsoft Research, 2004.

S. Hill, F. Provost, and C. Volinsky. Network-based marketing: Identifying likely adopters via consumer networks. Statistical Science, 22(2), 2006.

M. Jaeger. Relational Bayesian networks. In Proceedings of the 13th Conference on Uncertainty in Artificial Intelligence, pages 266–273, 1997.

D. Jensen and J. Neville. Linkage and autocorrelation cause feature selection bias in relational learning. In Proceedings of the 19th International Conference on Machine Learning, pages 259–266, 2002.

D. Jensen and J. Neville. Avoiding bias when aggregating relational data with degree disparity. In Proceedings of the 20th International Conference on Machine Learning, pages 274–281, 2003.

D. Jensen, J. Neville, and B. Gallagher. Why collective inference improves relational classification. In Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 593–598, 2004.

K. Kersting. Representational power of probabilistic-logical models: From upgrading to downgrading. In IJCAI-2003 Workshop on Learning Statistical Models from Relational Data, pages 61–62, 2003.

K. Kersting and L. De Raedt. Basic principles of learning Bayesian logic programs. Technical Report 174, Institute for Computer Science, University of Freiburg, 2002.

S. Kok and P. Domingos. Learning the structure of Markov logic networks. In Proceedings of the 22nd International Conference on Machine Learning, pages 441–448, 2005.

J. Lafferty, A. McCallum, and F. Pereira. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the 18th International Conference on Machine Learning, pages 282–289, 2001.

S. Lauritzen and N. Sheehan. Graphical models for genetic analyses. Statistical Science, 18(4):489–514, 2003.

E. Lehmann and G. Casella. Theory of Point Estimation. Springer-Verlag, New York, 1998.

Q. Lu and L. Getoor. Link-based classification. In Proceedings of the 20th International Conference on Machine Learning, pages 496–503, 2003.

S. Macskassy and F. Provost. A simple relational classifier. In Proceedings of the 2nd Workshop on Multi-Relational Data Mining, KDD2003, pages 64–76, 2003.

S. Macskassy and F. Provost. Classification in networked data: A toolkit and a univariate case study. Technical Report CeDER-04-08, Stern School of Business, New York University, 2004.

A. McCallum. Efficiently inducing features of conditional random fields. In Proceedings of the 19th Conference on Uncertainty in Artificial Intelligence, pages 403–410, 2003.

A. McCallum, K. Nigam, J. Rennie, and K. Seymore. A machine learning approach to building domain-specific search engines. In Proceedings of the 16th International Joint Conference on Artificial Intelligence, pages 662–667, 1999.

K. Murphy, Y. Weiss, and M. Jordan. Loopy belief propagation for approximate inference: An empirical study. In Proceedings of the 15th Conference on Uncertainty in Artificial Intelligence, pages 467–479, 1999.

R. Neal. Probabilistic inference using Markov chain Monte Carlo methods. Technical Report CRG-TR-93-1, Dept of Computer Science, University of Toronto, 1993.

J. Neville, O. Şimşek, D. Jensen, J. Komoroske, K. Palmer, and H. Goldberg. Using relational knowledge discovery to prevent securities fraud. In Proceedings of the 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 449–458, 2005.

J. Neville and D. Jensen. Iterative classification in relational data. In Proceedings of the Workshop on Statistical Relational Learning, 17th National Conference on Artificial Intelligence, pages 42–49, 2000.

J. Neville and D. Jensen. Supporting relational knowledge discovery: Lessons in architecture and algorithm design. In Proceedings of the Data Mining Lessons Learned Workshop, ICML2002, pages 57–64, 2002.

J. Neville and D. Jensen. Collective classification with relational dependency networks. In Proceedings of the 2nd Multi-Relational Data Mining Workshop, KDD2003, pages 77–91, 2003.

J. Neville and D. Jensen. Dependency networks for relational data. In Proceedings of the 4th IEEE International Conference on Data Mining, pages 170–177, 2004.

J. Neville and D. Jensen. Bias/variance analysis for network data. In Proceedings of the Workshop on Statistical Relational Learning, 23rd International Conference on Machine Learning, 2006.

J. Neville, D. Jensen, L. Friedland, and M. Hay. Learning relational probability trees. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 625–630, 2003a.

J. Neville, D. Jensen, and B. Gallagher. Simple estimators for relational Bayesian classifiers. In Proceedings of the 3rd IEEE International Conference on Data Mining, pages 609–612, 2003b.

C. Perlich and F. Provost. Aggregation-based feature invention and relational concept classes. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 167–176, 2003.

A. Popescul, L. Ungar, S. Lawrence, and D. Pennock. Statistical relational learning for document mining. In Proceedings of the 3rd IEEE International Conference on Data Mining, pages 275–282, 2003.

M. Richardson and P. Domingos. Markov logic networks. Machine Learning, 62:107–136, 2006.

S. Sanghai, P. Domingos, and D. Weld. Dynamic probabilistic relational models. In Proceedings of the 18th International Joint Conference on Artificial Intelligence, pages 992–1002, 2003.

B. Taskar, P. Abbeel, and D. Koller. Discriminative probabilistic models for relational data. In Proceedings of the 18th Conference on Uncertainty in Artificial Intelligence, pages 485–492, 2002.

B. Taskar, E. Segal, and D. Koller. Probabilistic classification and clustering in relational data. In Proceedings of the 17th International Joint Conference on Artificial Intelligence, pages 870–878, 2001.

H. White. Estimation, Inference and Specification Analysis. Cambridge University Press, New York, 1994.

B. Zadrozny and C. Elkan. Obtaining calibrated probability estimates from decision trees and naive Bayesian classifiers. In Proceedings of the 18th International Conference on Machine Learning, pages 609–616, 2001.