
56 jmlr-2010-Introduction to Causal Inference

Source: pdf

Author: Peter Spirtes

Abstract: The goal of many sciences is to understand the mechanisms by which variables came to take on the values they have (that is, to find a generative model), and to predict what the values of those variables would be if the naturally occurring mechanisms were subject to outside manipulations. The past 30 years have seen a number of conceptual developments that are partial solutions to the problem of causal inference from observational sample data or a mixture of observational sample and experimental data, particularly in the area of graphical causal modeling. However, in many domains, problems such as large numbers of variables, small sample sizes, and the possible presence of unmeasured causes remain serious impediments to practical applications of these developments. The articles in the Special Topic on Causality address these and other problems in applying graphical causal modeling algorithms. This introduction to the Special Topic on Causality provides a brief introduction to graphical causal modeling, places the articles in a broader context, and describes the differences between causal inference and ordinary machine learning classification and prediction problems.

Keywords: Bayesian networks, causation, causal inference

reference text

Constantin Aliferis, Alexander Statnikov, Ioannis Tsamardinos, Subramani Mani, and Xenophon Koutsoukos. Local causal and Markov blanket induction for causal discovery and feature selection for classification, Part I: Algorithms and empirical evaluation. Journal of Machine Learning Research, 11:171–234, 2010a.

Constantin Aliferis, Alexander Statnikov, Ioannis Tsamardinos, Subramani Mani, and Xenophon Koutsoukos. Local causal and Markov blanket induction for causal discovery and feature selection for classification, Part II: Analysis and extensions. Journal of Machine Learning Research, 11:235–284, 2010b.

Kenneth A. Bollen. Structural Equations with Latent Variables. Wiley-Interscience, 1989.

Facundo Bromberg and Dimitris Margaritis. Improving the reliability of causal discovery from small data sets using argumentation. Journal of Machine Learning Research, 10:301–340, 2009.

David Maxwell Chickering. Optimal structure identification with greedy search. Journal of Machine Learning Research, 3:507–554, 2002.

Greg Cooper and Clark Glymour. Computation, Causation, and Discovery. AAAI Press, 1999.

Gregory Cooper and Changwon Yoo. Causal discovery from a mixture of experimental and observational data. In Kathryn Laskey and Henri Prade, editors, Proceedings of the 15th Conference on Uncertainty in Artificial Intelligence, pages 116–125, San Francisco, CA, 1999. Morgan Kaufmann.

David Cox and Nanny Wermuth. Multivariate Dependencies: Models, Analysis and Interpretation (Monographs on Statistics and Applied Probability). Chapman and Hall, 1996.

Frederick Eberhardt, Richard Scheines, and Clark Glymour. On the number of experiments sufficient and in the worst case necessary to identify all causal relations among n variables. In Fahiem Bacchus and Tommi Jaakkola, editors, Proceedings of the 21st Conference on Uncertainty in Artificial Intelligence, pages 178–184, Arlington, VA, 2005. AUAI Press.

Ronald Fisher. The Design of Experiments. Macmillan Pub Co, 1971.

Yang-Bo He and Zhi Geng. Active learning of causal networks with intervention experiments and optimal designs. Journal of Machine Learning Research, 10:2523–2547, 2009.

David Heckerman and Dan Geiger. Learning Bayesian networks: a unification for discrete and Gaussian domains. In Philippe Besnard and Steve Hanks, editors, Proceedings of the 11th Conference on Uncertainty in Artificial Intelligence, pages 274–282. Morgan Kaufmann, 1995.

David Heckerman, Chris Meek, and Gregory Cooper. A Bayesian approach to causal discovery. In Greg Cooper and Clark Glymour, editors, Computation, Causation, and Discovery, pages 141–165. MIT Press, Cambridge, MA, 1999.

Jennifer Hoeting, David Madigan, Adrian Raftery, and Chris Volinsky. Bayesian model averaging: A tutorial. Statistical Science, 14(4):382–401, 1999.

Yimin Huang and Marco Valtorta. Identifiability in causal Bayesian networks: A sound and complete algorithm. In Proceedings of the Twenty-First National Conference on Artificial Intelligence, pages 1149–1154, Edinboro, Scotland, 2006. AAAI Press.

Markus Kalisch and Peter Buhlmann. Estimating high dimensional directed acyclic graphs with the PC algorithm. Journal of Machine Learning Research, 8:613–636, 2007.

Changsung Kang and Jin Tian. Markov properties for linear causal models with correlated errors. Journal of Machine Learning Research, 10:41–70, 2009.

Jan Koster. On the validity of the Markov interpretation of path diagrams of Gaussian structural equation models with correlated errors. Scandinavian Journal of Statistics, pages 413–431, 1999.

Steffen Lauritzen. Causal inference from graphical models. In D. Barnsdorf-Nielsen and C. Kluppenberg, editors, Complex Stochastic Systems, pages 141–165. Chapman and Hall, Baton Rouge, LA, 1999.

Steffen Lauritzen, Phil Dawid, B. Larsen, and H. Leimer. Independence properties of directed Markov fields. Networks, 20:491–505, 1990.

Marloes Maathuis, Markus Kalisch, and Peter Buhlmann. Estimating high-dimensional intervention effects from observational data. Annals of Statistics, 37(6A):3133–3164, 2009.

Chris Meek. Strong completeness and faithfulness in Bayesian networks. In Philippe Besnard and Steve Hanks, editors, Proceedings of the 11th Conference on Uncertainty in Artificial Intelligence, pages 411–419, Montreal, Quebec, 1995. Morgan Kaufmann.

Judea Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, 1988.

Judea Pearl. Causal diagrams for empirical research. Biometrika, 82(4):669–688, 1995.

Judea Pearl. Causality: Models, Reasoning, and Inference. Cambridge University Press, 2000.

Jean-Philippe Pellet and Andre Elisseeff. Using Markov blankets for causal structure learning. Journal of Machine Learning Research, 9:1295–1342, 2008.

Joseph Ramsey, Peter Spirtes, and Jiji Zhang. Adjacency-faithfulness and conservative causal inference. In Rina Dechter and Thomas Richardson, editors, Proceedings of the 22nd Conference on Uncertainty in Artificial Intelligence, pages 401–408, Cambridge, MA, 2006. AUAI Press.

Thomas Richardson. A discovery algorithm for directed cyclic graphs. In Eric Horvitz and Finn Jensen, editors, Proceedings of the 12th Conference on Uncertainty in Artificial Intelligence, pages 454–462, Cambridge, MA, 1996. Morgan Kaufmann.

Thomas Richardson. Markov properties for acyclic directed mixed graphs. Scandinavian Journal of Statistics, 30:145–157, 2003.

James Robins, Richard Scheines, Peter Spirtes, and Larry Wasserman. Uniform consistency in causal inference. Biometrika, 90(3):491–515, 2003.

R. Rodgers and C. Maranto. Causal-models of publishing productivity in psychology. J Appl Psychol, 74(4):636–649, 1989.

Shohei Shimizu, Aapo Hyvarinen, Patrick Hoyer, and Yutaku Kano. Finding a causal ordering via independent component analysis. Comput Stat Data An, 50(11):3278–3293, 2006.

Ilya Shpitser and Judea Pearl. Identification of conditional intervention distributions. In Rina Dechter and Thomas Richardson, editors, Proceedings of the 22nd Conference on Uncertainty in Artificial Intelligence, pages 437–444, Cambridge, MA, 2006a. AUAI Press.

Ilya Shpitser and Judea Pearl. Identification of joint interventional distributions in recursive semi-Markovian causal models. In Proceedings of the Twenty-First National Conference on Artificial Intelligence, pages 1219–1226, Menlo Park, California, 2006b. AAAI Press.

Ilya Shpitser and Judea Pearl. Complete identification methods for the causal hierarchy. Journal of Machine Learning Research, 9:1941–1979, 2008.

Ilya Shpitser, Thomas Richardson, and James Robins. Testing edges by truncation. In Proceedings of the 21st International Joint Conference on Artificial Intelligence, pages 1957–1963. AAAI Press, 2009.

Ricardo Silva, Richard Scheines, Clark Glymour, and Peter Spirtes. Learning the structure of linear latent variable models. Journal of Machine Learning Research, 7:191–246, 2006.

Peter Spirtes. Directed cyclic graphical representations of feedback models. In Philippe Besnard and Steve Hanks, editors, Proceedings of the 11th Conference on Uncertainty in Artificial Intelligence, pages 491–499, Montreal, Canada, 1995. Morgan Kaufmann.

Peter Spirtes and Clark Glymour. An algorithm for fast recovery of sparse causal graphs. Social Science Computer Review, 9(1):67–72, 1991.

Peter Spirtes, Clark Glymour, and Richard Scheines. Causation, Prediction, and Search. Springer-Verlag Lectures in Statistics, 1993.

Peter Spirtes, Clark Glymour, and Richard Scheines. Causation, Prediction, and Search, Second Edition (Adaptive Computation and Machine Learning). The MIT Press, 2001.

Robert Tillman, Arthur Gretton, and Peter Spirtes. Nonlinear directed acyclic structure learning with weakly additive noise models. In Y. Bengio, D. Schuurmans, J. Lafferty, C. K. I. Williams, and A. Culotta, editors, Proceedings of Advances in Neural Information Processing Systems 22, pages 1847–1855, Vancouver, BC, 2009. Curran Associates, Inc.

Changwon Yoo and Gregory Cooper. An evaluation of a system that recommends microarray experiments to perform to discover gene-regulation pathways. Artificial Intelligence in Medicine, 31(2):169–182, 2004.

Jiji Zhang. Causal reasoning with ancestral graphs. Journal of Machine Learning Research, 9:1437–1474, 2008.