nips nips2013 nips2013-107 nips2013-107-reference knowledge-graph by maker-knowledge-mining

107 nips-2013-Embed and Project: Discrete Sampling with Universal Hashing


Source: pdf

Author: Stefano Ermon, Carla P. Gomes, Ashish Sabharwal, Bart Selman

Abstract: We consider the problem of sampling from a probability distribution defined over a high-dimensional discrete set, specified for instance by a graphical model. We propose a sampling algorithm, called PAWS, based on embedding the set into a higher-dimensional space which is then randomly projected using universal hash functions to a lower-dimensional subspace and explored using combinatorial search methods. Our scheme can leverage fast combinatorial optimization tools as a blackbox and, unlike MCMC methods, samples produced are guaranteed to be within an (arbitrarily small) constant factor of the true probability distribution. We demonstrate that by using state-of-the-art combinatorial search tools, PAWS can efficiently sample from Ising grids with strong interactions and from software verification instances, while MCMC and variational methods fail in both cases. 1


reference text

[1] N.N. Madras. Lectures on Monte Carlo Methods. American Mathematical Society, 2002. ISBN 0821829785.

[2] M. Jerrum and A. Sinclair. The Markov chain Monte Carlo method: an approach to approximate counting and integration. Approximation algorithms for NP-hard problems, pages 482– 520, 1997.

[3] Mihir Bellare, Oded Goldreich, and Erez Petrank. Uniform generation of NP-witnesses using an NP-oracle. Information and Computation, 163(2):510–526, 2000.

[4] Stefano Ermon, Carla P. Gomes, and Bart Selman. Uniform solution sampling using a constraint solver as an oracle. In UAI, pages 255–264, 2012.

[5] C.P. Gomes, A. Sabharwal, and B. Selman. Near-uniform sampling of combinatorial spaces using XOR constraints. In NIPS-2006, pages 481–488, 2006.

[6] S. Chakraborty, K. Meel, and M. Vardi. A scalable and nearly uniform generator of SAT witnesses. In CAV-2013, 2013.

[7] Vibhav Gogate and Pedro Domingos. Approximation by quantization. In UAI, pages 247–255, 2011.

[8] Radford M Neal. Slice sampling. Annals of statistics, pages 705–741, 2003.

[9] Stefano Ermon, Carla Gomes, Ashish Sabharwal, and Bart Selman. Taming the curse of dimensionality: Discrete integration by hashing and optimization. In ICML, 2013.

[10] M.J. Wainwright and M.I. Jordan. Graphical models, exponential families, and variational inference. Foundations and Trends in Machine Learning, 1(1-2):1–305, 2008.

[11] S. Vadhan. Pseudorandomness. Foundations and Trends in Theoretical Computer Science, 2011.

[12] O. Goldreich. Randomized methods in computation. Lecture Notes, 2011.

[13] D. Allouche, S. de Givry, and T. Schiex. Toulbar2, an open source exact cost function network solver. Technical report, INRIA, 2010.

[14] IBM ILOG. IBM ILOG CPLEX Optimization Studio 12.3, 2011.

[15] Carla P. Gomes, Willem Jan van Hoeve, Ashish Sabharwal, and Bart Selman. Counting CSP solutions using generalized XOR constraints. In AAAI, 2007.

[16] Steffen L Lauritzen and David J Spiegelhalter. Local computations with probabilities on graphical structures and their application to expert systems. Journal of the Royal Statistical Society. Series B (Methodological), pages 157–224, 1988.

[17] Yehuda Naveh, Michal Rimon, Itai Jaeger, Yoav Katz, Michael Vinov, Eitan s Marcu, and Gil Shurek. Constraint-based random stimuli generation for hardware verification. AI magazine, 28(3):13, 2007.

[18] Clark Barrett, Aaron Stump, and Cesare Tinelli. The Satisfiability Modulo Theories Library (SMT-LIB). www.SMT-LIB.org, 2010.

[19] Patrice Godefroid, Michael Y Levin, David Molnar, et al. Automated whitebox fuzz testing. In NDSS, 2008.

[20] Patrice Godefroid, Michael Y. Levin, and David Molnar. Sage: Whitebox fuzzing for security testing. Queue, 10(1):20:20–20:27, January 2012. ISSN 1542-7730.

[21] M. Soos, K. Nohl, and C. Castelluccia. Extending SAT solvers to cryptographic problems. In SAT-2009. Springer, 2009.

[22] Robert J Bayardo and Joseph Daniel Pehoushek. Counting models using connected components. In AAAI-2000, pages 157–162, 2000. 9