nips nips2012 nips2012-236 nips2012-236-reference knowledge-graph by maker-knowledge-mining

236 nips-2012-Near-Optimal MAP Inference for Determinantal Point Processes

Source: pdf

Author: Jennifer Gillenwater, Alex Kulesza, Ben Taskar

Abstract: Determinantal point processes (DPPs) have recently been proposed as computationally efﬁcient probabilistic models of diverse sets for a variety of applications, including document summarization, image search, and pose estimation. Many DPP inference operations, including normalization and sampling, are tractable; however, ﬁnding the most likely conﬁguration (MAP), which is often required in practice for decoding, is NP-hard, so we must resort to approximate inference. This optimization problem, which also arises in experimental design and sensor placement, involves ﬁnding the largest principal minor of a positive semideﬁnite matrix. Because the objective is log-submodular, greedy algorithms have been used in the past with some empirical success; however, these methods only give approximation guarantees in the special case of monotone objectives, which correspond to a restricted class of DPPs. In this paper we propose a new algorithm for approximating the MAP problem based on continuous techniques for submodular function maximization. Our method involves a novel continuous relaxation of the log-probability function, which, in contrast to the multilinear extension used for general submodular functions, can be evaluated and differentiated exactly and efﬁciently. We obtain a practical algorithm with a 1/4-approximation guarantee for a more general class of non-monotone DPPs; our algorithm also extends to MAP inference under complex polytope constraints, making it possible to combine DPPs with Markov random ﬁelds, weighted matchings, and other models. We demonstrate that our approach outperforms standard and recent methods on both synthetic and real-world data. 1

reference text

[1] A. Nenkova, L. Vanderwende, and K. McKeown. A Compositional Context-Sensitive Multi-Document Summarizer: Exploring the Factors that Inﬂuence Summarization. In Proc. SIGIR, 2006.

[2] H. Lin and J. Bilmes. Multi-document Summarization via Budgeted Maximization of Submodular Functions. In Proc. NAACL/HLT, 2010.

[3] F. Radlinski, R. Kleinberg, and T. Joachims. Learning Diverse Rankings with Multi-Armed Bandits. In Proc. ICML, 2008.

[4] Y. Yue and T. Joachims. Predicting Diverse Subsets Using Structural SVMs. In Proc. ICML, 2008.

[5] A. Kulesza and B. Taskar. k-DPPs: Fixed-Size Determinantal Point Processes. In Proc. ICML, 2011.

[6] C. Guestrin, A. Krause, and A. Singh. Near-Optimal Sensor Placements in Gaussian Processes. In Proc. ICML, 2005.

[7] D. Kempe, J. Kleinberg, and E. Tardos. Inﬂuential Nodes in a Diffusion Model for Social Networks. In Automata, Languages and Programming, volume 3580 of Lecture Notes in Computer Science. 2005.

[8] O. Macchi. The Coincidence Approach to Stochastic Point Processes. Advances in Applied Probability, 7(1), 1975.

[9] D. Daley and D. Vere-Jones. An Introduction to the Theory of Point Processes: Elementary Theory and Methods. 2003.

[10] A. Kulesza and B Taskar. Structured Determinantal Point Processes. In Proc. NIPS, 2010.

[11] A. Kulesza, J. Gillenwater, and B. Taskar. Discovering Diverse and Salient Threads in Document Collections. In Proc. EMNLP, 2012.

[12] J. Hough, M. Krishnapur, Y. Peres, and B. Vir´ g. Determinantal Processes and Independence. Probability a Surveys, 3, 2006.

[13] A. Kulesza and B. Taskar. Learning Determinantal Point Processes. In Proc. UAI, 2011.

[14] C. Ko, J. Lee, and M. Queyranne. An Exact Algorithm for Maximum Entropy Sampling. Operations Research, 43(4), 1995.

[15] G. Nemhauser, L. Wolsey, and M. Fisher. An Analysis of Approximations for Maximizing Submodular Set Functions I. Mathematical Programming, 14(1), 1978.

[16] U. Feige, V. Mirrokni, and J. Vondrak. Maximizing Non-Monotone Submodular Functions. In Proc. FOCS, 2007.

[17] T. Robertazzi and S. Schwartz. An Accelerated Sequential Algorithm for Producing D-optimal Designs. SIAM J. Sci. Stat. Comput., 10(2), 1989.

[18] N. Buchbinder, M. Feldman, J. Naor, and R. Schwartz. A Tight Linear Time (1/2)-Approximation for Unconstrained Submodular Maximization. In Proc. FOCS, 2012.

[19] A. Gupta, A. Roth, G. Schoenebeck, and K. Talwar. Constrained Nonmonotone Submodular Maximization: Ofﬂine and Secretary Algorithms. In Internet and Network Economics, volume 6484 of LNCS. 2010.

[20] S. Gharan and J. Vondr´ k. Submodular Maximization by Simulated Annealing. In Proc. Soda, 2011. a

[21] C. Chekuri, J. Vondr´ k, and R. Zenklusen. Submodular Function Maximization via the Multilinear Rea laxation and Contention Resolution Schemes. arXiv:1105.4593, 2011.

[22] M. Feldman, J. Naor, and R. Schwartz. Nonmonotone Submodular Maximization via a Structural Continuous Greedy Algorithm. Automata, Languages and Programming, 2011.

[23] A. Borodin and A. Soshnikov. Janossy Densities I. Determinantal Ensembles. Journal of Statistical Physics, 113(3), 2003.

[24] A. Borodin. Determinantal Point Processes. arXiv:0911.1153, 2009.

[25] M. Feldman, J. Naor, and R. Schwartz. A Uniﬁed Continuous Greedy Algorithm for Submodular Maximization. In Proc. FOCS, 2011.

[26] D. Bertsekas. Nonlinear Programming. Athena Scientiﬁc, 1999.

[27] M. Frank and P. Wolfe. An Algorithm for Quadratic Programming. Naval Research Logistics Quarterly, 3(1-2), 1956.

[28] A. Ageev and M. Sviridenko. Pipage Rounding: A New Method of Constructing Algorithms with Proven Performance Guarantee. Journal of Combinatorial Optimization, 8(3), 2004.

[29] A. James. Distributions of Matrix Variates and Latent Roots Derived from Normal Samples. Annals of Mathematical Statistics, 35(2), 1964.

[30] P. Hsu. On the Distribution of Roots of Certain Determinantal Equations. Annals of Eugenics, 9(3), 1939. 9