
83 nips-2010-Evidence-Specific Structures for Rich Tractable CRFs


Source: pdf

Author: Anton Chechetka, Carlos Guestrin

Abstract: We present a simple and effective approach to learning tractable conditional random fields whose structure depends on the evidence. Our approach retains the advantages of tractable discriminative models, namely efficient exact inference and arbitrarily accurate parameter learning in polynomial time. At the same time, our algorithm does not suffer the large expressive-power penalty inherent in fixed tractable structures. On real-life relational datasets, our approach matches or exceeds the state-of-the-art accuracy of dense models while providing an order-of-magnitude speedup.

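To make the general idea concrete, below is a hedged, illustrative sketch (not the paper's actual algorithm; the function names, edge-scoring choice, and toy potentials are hypothetical stand-ins): for each input, score candidate edges with some evidence-dependent function, keep a maximum spanning tree as the tractable structure, in the spirit of Chow-Liu [7], and then run exact max-product inference on that tree.

import numpy as np

def max_spanning_tree(weights):
    # Prim's algorithm on a dense symmetric weight matrix.
    # Returns directed (parent, child) edges rooted at node 0.
    n = weights.shape[0]
    in_tree = [0]
    remaining = set(range(1, n))
    edges = []
    while remaining:
        u, v = max(((a, b) for a in in_tree for b in remaining),
                   key=lambda e: weights[e[0], e[1]])
        edges.append((u, v))
        in_tree.append(v)
        remaining.remove(v)
    return edges

def decode_tree(node_pot, edge_pot, edges):
    # Exact MAP decoding on a tree-structured CRF via max-product message
    # passing over log-potentials; node_pot is (n, k), and edge_pot[(u, v)]
    # is (k, k) with rows indexed by the parent label.
    n, k = node_pot.shape
    children = {v: [] for v in range(n)}
    parent = {0: None}
    for u, v in edges:
        children[u].append(v)
        parent[v] = u
    order, stack = [], [0]          # depth-first preorder from the root
    while stack:
        u = stack.pop()
        order.append(u)
        stack.extend(children[u])
    belief = node_pot.copy()
    back = {}
    for v in reversed(order[1:]):   # leaves-to-root pass
        scores = edge_pot[(parent[v], v)] + belief[v]   # broadcast over child labels
        belief[parent[v]] += scores.max(axis=1)
        back[v] = scores.argmax(axis=1)
    labels = np.zeros(n, dtype=int)
    labels[0] = int(belief[0].argmax())
    for v in order[1:]:             # root-to-leaves decode
        labels[v] = int(back[v][labels[parent[v]]])
    return labels

# Toy usage with made-up, evidence-dependent edge scores and potentials.
rng = np.random.default_rng(0)
W = rng.random((4, 4))
W = (W + W.T) / 2                   # stand-in for learned edge scores given evidence x
tree = max_spanning_tree(W)
node_pot = rng.standard_normal((4, 2))
edge_pot = {e: rng.standard_normal((2, 2)) for e in tree}
print(tree, decode_tree(node_pot, edge_pot, tree))

In the paper itself, the structure and potentials come from learned, evidence-conditioned features rather than random stand-ins; the point of the sketch is only that a per-instance tree is what keeps inference exact and polynomial-time.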

reference text

[1] J. D. Lafferty, A. McCallum, and F. C. N. Pereira. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In ICML, 2001.

[2] M. Schmidt, K. Murphy, G. Fung, and R. Rosales. Structure learning in random fields for heart motion abnormality detection. In CVPR, 2008.

[3] D. Koller and N. Friedman. Probabilistic Graphical Models: Principles and Techniques. 2009.

[4] J. K. Bradley and C. Guestrin. Learning tree conditional random fields. In ICML, to appear, 2010.

[5] D. Roth. On the hardness of approximate reasoning. Artificial Intelligence, 82(1-2), 1996.

[6] C. Sutton and A. McCallum. Piecewise pseudolikelihood for efficient CRF training. In ICML, 2007.

[7] C. Chow and C. Liu. Approximating discrete probability distributions with dependence trees. IEEE Transactions on Information Theory, 14(3), 1968.

[8] D. Karger and N. Srebro. Learning Markov networks: Maximum bounded tree-width graphs. In SODA, 2001.

[9] D. C. Liu and J. Nocedal. On the limited memory BFGS method for large scale optimization. Mathematical Programming, 45(3), 1989.

[10] J. Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. 1988.

[11] J. S. Yedidia, W. T. Freeman, and Y. Weiss. Generalized belief propagation. In NIPS, 2000.

[12] S. Geman and D. Geman. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-6(6), 1984.

[13] S. Arnborg, D. G. Corneil, and A. Proskurowski. Complexity of finding embeddings in a k-tree. SIAM Journal on Algebraic and Discrete Methods, 8(2), 1987.

[14] W. Härdle, M. Müller, S. Sperlich, and A. Werwatz. Nonparametric and Semiparametric Models. 2004.

[15] D. Shahaf, A. Chechetka, and C. Guestrin. Learning thin junction trees via graph cuts. In AISTATS, 2009.

[16] L. Getoor and B. Taskar. Introduction to Statistical Relational Learning. The MIT Press, 2007.

[17] A. Deshpande, C. Guestrin, S. Madden, J. Hellerstein, and W. Hong. Model-driven data acquisition in sensor networks. In VLDB, 2004.

[18] A. Krause and C. Guestrin. Near-optimal nonmyopic value of information in graphical models. In UAI, 2005.

[19] A. Chechetka, D. Dash, and M. Philipose. Relational learning for collective classification of entities in images. In AAAI Workshop on Statistical Relational AI, 2010.

[20] M. Richardson and P. Domingos. Markov logic networks. Machine Learning, 62(1-2), 2006.

[21] S. Kok, M. Sumner, M. Richardson, P. Singla, H. Poon, D. Lowd, and P. Domingos. The Alchemy system for statistical relational AI. Technical report, University of Washington, Seattle, WA, 2009.

[22] J. Gonzalez, Y. Low, and C. Guestrin. Residual splash for optimally parallelizing belief propagation. In AISTATS, 2009.

[23] B. Taskar, P. Abbeel, and D. Koller. Discriminative probabilistic models for relational data. In UAI, 2002.

[24] B. Taskar, C. Guestrin, and D. Koller. Max-margin Markov networks. In NIPS, 2003.

[25] C. Boutilier, N. Friedman, M. Goldszmidt, and D. Koller. Context-specific independence in Bayesian networks. In UAI, 1996.

[26] M. desJardins, P. Rathod, and L. Getoor. Bayesian network learning with abstraction hierarchies and context-specific independence. In ECML, 2005.

[27] B. Thiesson, C. Meek, D. Chickering, and D. Heckerman. Learning mixtures of DAG models. In UAI, 1997.

[28] M. Meilă and M. I. Jordan. Learning with mixtures of trees. JMLR, 1, 2001.

[29] A. J. Viterbi. Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Transactions on Information Theory, IT-13, 1967.

[30] S. L. Lauritzen. The EM algorithm for graphical association models with missing data. Computational Statistics & Data Analysis, 19(2), 1995.

[31] A. Torralba, K. P. Murphy, and W. T. Freeman. Contextual models for object detection using boosted random fields. In NIPS, 2004.

[32] C. Sutton and A. McCallum. Collective segmentation and labeling of distant entities in information extraction. In ICML Workshop on Statistical Relational Learning and Its Connections, 2004.