Distributed Submodular Maximization: Identifying Representative Elements in Massive Data (NIPS 2013)

Source: pdf

Author: Baharan Mirzasoleiman, Amin Karbasi, Rik Sarkar, Andreas Krause

Abstract: Many large-scale machine learning problems (such as clustering, non-parametric learning, and kernel machines) require selecting, out of a massive data set, a manageable yet representative subset. Such problems can often be reduced to maximizing a submodular set function subject to cardinality constraints. Classical approaches require centralized access to the full data set, but for truly large-scale problems, rendering the data centrally is often impractical. In this paper, we consider the problem of submodular function maximization in a distributed fashion. We develop a simple, two-stage protocol, GreeDi, that is easily implemented using MapReduce-style computations. We theoretically analyze our approach and show that, under certain natural conditions, performance close to the (impractical) centralized approach can be achieved. In our extensive experiments, we demonstrate the effectiveness of our approach on several applications, including sparse Gaussian process inference and exemplar-based clustering, on tens of millions of data points using Hadoop.
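
The abstract names a two-stage protocol but does not spell out the stages. The following is a minimal sketch, assuming the first stage runs the standard greedy algorithm independently on each of m data partitions and the second stage runs greedy again on the union of the partial solutions. The function names (greedy, two_stage_greedy), the round-robin partitioning, the final "best of merged and local" step, and the toy set-coverage objective are all illustrative assumptions, not the authors' implementation.

```python
from typing import Callable, Dict, Hashable, List, Sequence, Set

Objective = Callable[[List[Hashable]], float]


def greedy(candidates: Sequence[Hashable], f: Objective, k: int) -> List[Hashable]:
    """Pick up to k elements, each time adding the one with the largest marginal gain."""
    selected: List[Hashable] = []
    remaining = list(candidates)
    current = f(selected)
    while remaining and len(selected) < k:
        gains = [f(selected + [e]) - current for e in remaining]
        best = max(range(len(remaining)), key=gains.__getitem__)
        current += gains[best]
        selected.append(remaining.pop(best))
    return selected


def two_stage_greedy(data: Sequence[Hashable], f: Objective, k: int, m: int) -> List[Hashable]:
    """Hypothetical two-stage scheme: local greedy per partition, then greedy on the union."""
    # Stage 1 (map-like): split the data into m parts and solve each part independently.
    parts = [data[i::m] for i in range(m)]
    local = [greedy(part, f, k) for part in parts]
    # Stage 2 (reduce-like): run greedy once more on the union of the local solutions.
    merged = [e for solution in local for e in solution]
    final = greedy(merged, f, k)
    # Assumption: keep whichever candidate solution scores best under f.
    return max([final] + local, key=f)


# Toy monotone submodular objective: number of items covered by the chosen sets.
SETS: Dict[int, Set[int]] = {
    0: {1, 2, 3},
    1: {3, 4},
    2: {4, 5, 6, 7},
    3: {1},
    4: {8},
    5: {2, 5},
}


def coverage(chosen: List[Hashable]) -> float:
    covered: Set[int] = set()
    for i in chosen:
        covered |= SETS[i]
    return float(len(covered))


result = two_stage_greedy(list(SETS), coverage, k=2, m=2)
print(result, coverage(result))
```

In this sketch each partition is processed without access to the others, so the first stage could in principle be run as independent map tasks, with only the small local solutions sent to a single reducer.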

reference text

[1] Delbert Dueck and Brendan J. Frey. Non-metric affinity propagation for unsupervised image categorization. In ICCV, 2007.

[2] Carl Edward Rasmussen and Christopher K. I. Williams. Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning). 2006.

[3] Hui Lin and Jeff Bilmes. A class of submodular functions for document summarization. In North American chapter of the Assoc. for Comp. Linguistics/Human Lang. Tech., 2011.

[4] Ryan Gomes and Andreas Krause. Budgeted nonparametric learning from data streams. In Proc. International Conference on Machine Learning (ICML), 2010.

[5] Andreas Krause and Daniel Golovin. Submodular function maximization. In Tractability: Practical Approaches to Hard Problems. Cambridge University Press, 2013.

[6] David Kempe, Jon Kleinberg, and Éva Tardos. Maximizing the spread of influence through a social network. In Proceedings of the ninth ACM SIGKDD, 2003.

[7] Andreas Krause and Carlos Guestrin. Submodularity and its applications in optimized information gathering. ACM Transactions on Intelligent Systems and Technology, 2011.

[8] Andrew Guillory and Jeff Bilmes. Active semi-supervised learning using submodular functions. In Uncertainty in Artificial Intelligence (UAI), Barcelona, Spain, July 2011. AUAI.

[9] Daniel Golovin and Andreas Krause. Adaptive submodularity: Theory and applications in active learning and stochastic optimization. Journal of Artificial Intelligence Research, 2011.

[10] George L. Nemhauser, Laurence A. Wolsey, and Marshall L. Fisher. An analysis of approximations for maximizing submodular set functions - I. Mathematical Programming, 1978.

[11] G. L. Nemhauser and L. A. Wolsey. Best algorithms for approximating the maximum of a submodular set function. Math. Oper. Research, 1978.

[12] Uriel Feige. A threshold of ln n for approximating set cover. Journal of the ACM, 1998.

[13] Jeffrey Dean and Sanjay Ghemawat. MapReduce: Simplified data processing on large clusters. In OSDI, 2004.

[14] Cheng-Tao Chu, Sang Kyun Kim, Yi-An Lin, YuanYuan Yu, Gary Bradski, and Andrew Y. Ng. Mapreduce for machine learning on multicore. In NIPS, 2006.

[15] Jaliya Ekanayake, Shrideep Pallickara, and Geoffrey Fox. MapReduce for data intensive scientific analyses. In Proc. of the 4th IEEE Inter. Conf. on eScience.

[16] Daniel Golovin, Matthew Faulkner, and Andreas Krause. Online distributed sensor selection. In IPSN, 2010.

[17] Graham Cormode, Howard Karloff, and Anthony Wirth. Set cover algorithms for very large datasets. In Proc. of the 19th ACM intern. conf. on Inf. knowl. manag.

[18] Flavio Chierichetti, Ravi Kumar, and Andrew Tomkins. Max-cover in map-reduce. In Proceedings of the 19th international conference on World wide web, 2010.

[19] Guy E. Blelloch, Richard Peng, and Kanat Tangwongsan. Linear-work greedy parallel approximate set cover and variants. In SPAA, 2011.

[20] Silvio Lattanzi, Benjamin Moseley, Siddharth Suri, and Sergei Vassilvitskii. Filtering: a method for solving graph problems in mapreduce. In SPAA, 2011.

[21] A. Krause and C. Guestrin. Near-optimal nonmyopic value of information in graphical models. In Proc. of Uncertainty in Artificial Intelligence (UAI), 2005.

[22] M. Minoux. Accelerated greedy algorithms for maximizing submodular set functions. Optimization Techniques, LNCS, pages 234–243, 1978.

[23] Leonard Kaufman and Peter J Rousseeuw. Finding groups in data: an introduction to cluster analysis, volume 344. Wiley-Interscience, 2009.

[24] Antonio Torralba, Rob Fergus, and William T Freeman. 80 million tiny images: A large data set for nonparametric object and scene recognition. IEEE Trans. Pattern Anal. Mach. Intell., 2008.

[25] Athanasios Tsanas, Max A Little, Patrick E McSharry, and Lorraine O Ramig. Enhanced classical dysphonia measures and sparse regression for telemonitoring of Parkinson’s disease progression. In IEEE Int. Conf. Acoust. Speech Signal Process., 2010.

[26] Yahoo! academic relations. r6a, yahoo! front page today module user click log dataset, version 1.0, 2012.

[27] Tore Opsahl and Pietro Panzarasa. Clustering in weighted networks. Social Networks, 2009.