nips nips2013 nips2013-319 knowledge-graph by maker-knowledge-mining

319 nips-2013-Submodular Optimization with Submodular Cover and Submodular Knapsack Constraints


Source: pdf

Author: Rishabh K. Iyer, Jeff A. Bilmes

Abstract: We investigate two new optimization problems — minimizing a submodular function subject to a submodular lower bound constraint (submodular cover) and maximizing a submodular function subject to a submodular upper bound constraint (submodular knapsack). We are motivated by a number of real-world applications in machine learning including sensor placement and data subset selection, which require maximizing a certain submodular function (like coverage or diversity) while simultaneously minimizing another (like cooperative cost). These problems are often posed as minimizing the difference between submodular functions [9, 25] which is in the worst case inapproximable. We show, however, that by phrasing these problems as constrained optimization, which is more natural for many applications, we achieve a number of bounded approximation guarantees. We also show that both these problems are closely related and an approximation algorithm solving one can be used to obtain an approximation guarantee for the other. We provide hardness results for both problems thus showing that our approximation factors are tight up to log-factors. Finally, we empirically demonstrate the performance and good scalability properties of our algorithms. 1

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 edu Abstract We investigate two new optimization problems — minimizing a submodular function subject to a submodular lower bound constraint (submodular cover) and maximizing a submodular function subject to a submodular upper bound constraint (submodular knapsack). [sent-5, score-3.019]

2 We are motivated by a number of real-world applications in machine learning including sensor placement and data subset selection, which require maximizing a certain submodular function (like coverage or diversity) while simultaneously minimizing another (like cooperative cost). [sent-6, score-0.835]

3 These problems are often posed as minimizing the difference between submodular functions [9, 25] which is in the worst case inapproximable. [sent-7, score-0.79]

4 We also show that both these problems are closely related and an approximation algorithm solving one can be used to obtain an approximation guarantee for the other. [sent-9, score-0.282]

5 We provide hardness results for both problems thus showing that our approximation factors are tight up to log-factors. [sent-10, score-0.246]

6 1 Introduction A set function f : 2V → R is said to be submodular [4] if for all subsets S, T ⊆ V , it holds that f (S) + f (T ) ≥ f (S ∪ T ) + f (S ∩ T ). [sent-12, score-0.709]

7 Defining f(j|S) ≜ f(S ∪ j) − f(S) as the gain of j ∈ V in the context of S ⊆ V, f is submodular if and only if f(j|S) ≥ f(j|T) for all S ⊆ T and j ∉ T. [sent-13, score-0.709]
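
To make the gain notation concrete, here is a minimal Python sketch (a toy coverage function and ground set of our own choosing, not from the paper) that computes gains f(j|S) and brute-force checks the diminishing-returns characterization above on a small ground set.

```python
# Toy illustration of gains f(j|S) = f(S ∪ {j}) − f(S) and of the
# diminishing-returns test for submodularity. The coverage function `f`
# and the ground set below are illustrative assumptions.
from itertools import combinations

V = [0, 1, 2, 3]
areas = {0: {"a", "b"}, 1: {"b", "c"}, 2: {"c", "d"}, 3: {"d"}}

def f(S):
    """Coverage function: number of distinct areas covered by S."""
    return len(set().union(*(areas[j] for j in S))) if S else 0

def gain(f, j, S):
    """Marginal gain f(j|S)."""
    return f(set(S) | {j}) - f(set(S))

def is_submodular(f, V):
    """Brute-force check: f(j|S) >= f(j|T) for all S ⊆ T ⊆ V and j ∉ T."""
    subsets = [set(c) for r in range(len(V) + 1) for c in combinations(V, r)]
    return all(gain(f, j, S) >= gain(f, j, T)
               for S in subsets for T in subsets if S <= T
               for j in V if j not in T)

print(is_submodular(f, V))  # True: coverage functions are submodular
```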

8 While general set function optimization is often intractable, many forms of submodular function optimization can be solved near optimally or even optimally in certain cases. [sent-16, score-0.793]

9 In this paper, we study a new class of discrete optimization problems that have the following form: Problem 1 (SCSC): min{f(X) | g(X) ≥ c}, and Problem 2 (SCSK): max{g(X) | f(X) ≤ b}, where f and g are monotone non-decreasing submodular functions that also, w.l.o.g., are normalized (f(∅) = g(∅) = 0). [sent-18, score-0.849]

10 The corresponding constraints are called the submodular cover [29] and submodular knapsack [1] respectively and hence we refer to Problem 1 as Submodular Cost Submodular Cover (henceforth SCSC) and Problem 2 as Submodular Cost Submodular Knapsack (henceforth SCSK). [sent-23, score-1.722]

11 Our motivation stems from an interesting class of problems that require minimizing a certain submodular function f while simultaneously maximizing another submodular function g. [sent-24, score-1.487]

12 We shall see that these problems arise naturally in such applications. (Footnote 1: a monotone non-decreasing normalized (f(∅) = 0) submodular function is called a polymatroid function.) [sent-25, score-0.8]

13 A standard approach used in literature [9, 25, 15] has been to transform these problems into minimizing the difference between submodular functions (also called DS optimization): Problem 0: min f (X) − g(X) . [sent-27, score-0.79]

14 On the other hand, in many applications, one of the submodular functions naturally serves as part of a constraint. [sent-30, score-0.763]

15 Often the objective functions encouraging small vocabulary subsets and large acoustic spans are submodular [23, 20] and hence this problem can naturally be cast as an instance of Problems 1 and 2. [sent-47, score-0.82]

16 Machine Translation: Another application in machine translation is to choose a subset of training data that is optimized for a given test data set, a problem previously addressed with modular functions [24]. [sent-50, score-0.288]

17 Defining a submodular function with ground set over the union of training and test sample inputs V = Vtr ∪ Vte , we can set f : 2Vtr → R+ to f (X) = f (X|Vte ), and take g(X) = |X|, and b ≈ 0 in Problem 2 to address this problem. [sent-51, score-0.709]

18 For example, the Submodular Set Cover problem (henceforth SSC) [29] occurs as a special case of Problem 1, with f being modular and g submodular. [sent-54, score-0.202]

19 Similarly, the Submodular Cost Knapsack problem (henceforth SK) [28] is a special case of Problem 2, again with f modular and g submodular. [sent-55, score-0.202]

20 When both f and g are modular, Problems 1 and 2 are called knapsack problems [16]. [sent-57, score-0.256]

21 We also show that many combinatorial algorithms, like the greedy algorithms for SK [28] and SSC [29], belong to this framework, and we provide the first constant-factor bi-criterion approximation algorithm for SSC [29] and hence for the general set cover problem [3]. [sent-62, score-0.321]

22 We then show how with suitable choices of approximate functions, we can obtain a number of bounded approximation guarantees and show the hardness for Problems 1 and 2, which in fact match some of our approximation guarantees. [sent-63, score-0.226]

23 Our guarantees and hardness results depend on the curvature of the submodular functions [2]. [sent-64, score-0.912]

24 We observe a strong asymmetry in the results: the factors change polynomially with the curvature of f but only by a constant factor with the curvature of g, hence making SK and SSC much easier than SCSK and SCSC. [sent-65, score-0.203]

25 Given a submodular function f, we define the total curvature κf as: κf = 1 − min_{j∈V} f(j|V\j) / f(j) [2]. [sent-68, score-0.709]

26 Intuitively, the curvature 0 ≤ κf ≤ 1 measures the distance of f from modularity, and κf = 0 if and only if f is modular (or additive, i.e., f(X) = Σ_{j∈X} f(j)). [sent-69, score-0.294]
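
As a small illustration of this definition, the following sketch (assuming a normalized, monotone f with f(j) > 0 for every singleton, passed as a Python callable on sets) computes the total curvature directly from the gains.

```python
def total_curvature(f, V):
    """Total curvature: kappa_f = 1 − min_j f(j | V \\ j) / f(j)."""
    V = set(V)
    ratios = [(f(V) - f(V - {j})) / (f({j}) - f(set())) for j in V]
    return 1.0 - min(ratios)

# For a modular f every ratio equals 1, so total_curvature(f, V) == 0.
```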

27 A number of approximation guarantees in the context of submodular optimization have been refined via the curvature of the submodular function [2, 13, 12]. [sent-72, score-1.631]

28 In this paper, we shall witness the role of curvature also in determining the approximations and the hardness of problems 1 and 2. [sent-73, score-0.19]

29 Step 2 of Algorithm 1: choose surrogate functions f̂_t and ĝ_t for f and g respectively, tight at X^{t−1}. [sent-76, score-0.544]

30 The surrogate functions we consider in this paper are in the form of bounds (upper or lower) and approximations. [sent-80, score-0.188]

31 Modular lower bounds: Akin to convex functions, submodular functions have tight modular lower bounds. [sent-81, score-1.0]

32 These bounds are related to the subdifferential ∂f (Y ) of the submodular set function f at a set Y ⊆ V [4]. [sent-82, score-0.735]

33 Modular upper bounds: We can also define superdifferentials ∂^f(Y) of a submodular function [14, 10] at Y. [sent-93, score-0.727]

34 It is possible, moreover, to provide specific supergradients [10, 13] that define the following two modular upper bounds (when referring to either one, we use m^f_X): m^f_{X,1}(Y) ≜ f(X) − Σ_{j∈X\Y} f(j|X\j) + Σ_{j∈Y\X} f(j|∅), and m^f_{X,2}(Y) ≜ f(X) − Σ_{j∈X\Y} f(j|V\j) + Σ_{j∈Y\X} f(j|X). [sent-94, score-0.483]

35 Then m^f_{X,1}(Y) ≥ f(Y) and m^f_{X,2}(Y) ≥ f(Y), ∀Y ⊆ V, and m^f_{X,1}(X) = m^f_{X,2}(X) = f(X). [sent-95, score-0.316]
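
Both bounds are cheap to evaluate once the gains are available. A minimal sketch (any set function f passed as a Python callable on sets is an assumption here, not the paper's code):

```python
def mub1(f, X, V, Y):
    """Modular upper bound m^f_{X,1}(Y), tight at X."""
    X, V, Y = set(X), set(V), set(Y)
    return (f(X)
            - sum(f(X) - f(X - {j}) for j in X - Y)    # f(j | X \ j)
            + sum(f({j}) - f(set()) for j in Y - X))   # f(j | ∅)

def mub2(f, X, V, Y):
    """Modular upper bound m^f_{X,2}(Y), tight at X."""
    X, V, Y = set(X), set(V), set(Y)
    return (f(X)
            - sum(f(V) - f(V - {j}) for j in X - Y)    # f(j | V \ j)
            + sum(f(X | {j}) - f(X) for j in Y - X))   # f(j | X)
```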

36 In particular, choosing f̂_t as a modular upper bound of f tight at X^t, or ĝ_t as a modular lower bound of g tight at X^t, or both, ensures that the objective value of Problems 1 and 2 always improves at every iteration as long as the corresponding surrogate problem can be solved exactly. [sent-98, score-0.838]
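
The improvement claim can be verified in one line for Problem 1 (a short worked step under the stated assumptions, not verbatim from the paper): if f̂_t is an upper bound of f that is tight at the feasible point X^t, and X^{t+1} is an exact minimizer of f̂_t over {X : g(X) ≥ c}, then

```latex
f(X^{t+1}) \;\le\; \hat{f}_t(X^{t+1}) \;\le\; \hat{f}_t(X^{t}) \;=\; f(X^{t}),
```

and the symmetric argument with a modular lower bound ĝ_t of g gives monotone improvement of the objective in Problem 2.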

37 Unfortunately, Problems 1 and 2 are NP-hard even if f or g (or both) are modular [3], and therefore the surrogate problems themselves cannot be solved exactly. [sent-99, score-0.375]

38 Fortunately, the surrogate problems are often much easier than the original ones and can admit log or constant-factor guarantees. [sent-100, score-0.173]

39 What is also fortunate and perhaps surprising, as we show below, is that unlike the case of DS optimization (where the problem is inapproximable in general [9]), the constrained forms of optimization (Problems 1 and 2) do have approximation guarantees. [sent-103, score-0.144]

40 (We assume w.l.o.g. that f(j) > 0, g(j) > 0, ∀j ∈ V.) Ellipsoidal Approximation: We also consider ellipsoidal approximations (EA) of f. [sent-107, score-0.14]

41 The key result of Goemans et al. [6] is an algorithm based on approximating the submodular polyhedron by an ellipsoid. [sent-109, score-0.732]

42 They show that for any polymatroid function f, one can compute an approximation of the form √(w_f(X)) for a certain modular weight vector w_f ∈ R^V, such that √(w_f(X)) ≤ f(X) ≤ O(√n log n) √(w_f(X)), ∀X ⊆ V. [sent-110, score-0.796]

43 Then, the submodular function f^ea(X) = κf √(w_{f^κ}(X)) + (1 − κf) Σ_{j∈X} f(j) satisfies [12]: f^ea(X) ≤ f(X) ≤ O( √n log n / (1 + (√n log n − 1)(1 − κf)) ) f^ea(X), ∀X ⊆ V (2); that is, f^ea is multiplicatively bounded by f by a factor depending on √n and the curvature. [sent-112, score-1.129]

44 In particular, the surrogate functions f̂ or ĝ in Algorithm 1 can be the ellipsoidal approximations above, and the multiplicative bounds transform into approximation guarantees for these problems. [sent-114, score-0.427]

45 Algorithms 2 and 3 provide the schematics for using an approximation algorithm for one of the problems for solving the other. [sent-143, score-0.144]

46 4 4 Approximation Algorithms We consider several algorithms for Problems 1 and 2, which can all be characterized by the framework of Algorithm 1, using the surrogate functions of the form of upper/lower bounds or approximations. [sent-155, score-0.207]

47 We first investigate a special case, the submodular set cover (SSC), and then provide two algorithms: one (ISSC) that is very practical but has a weaker theoretical guarantee, and another (EASSC) that is slow but has the tightest guarantee. [sent-159, score-0.822]

48 Submodular Set Cover (SSC): We start by considering a classical special case of SCSC (Problem 1) where f is already a modular function and g is a submodular function. [sent-160, score-0.911]

49 This problem was first investigated by Wolsey [29], wherein he showed that a simple greedy algorithm achieves bounded (in fact, log-factor) approximation guarantees. [sent-162, score-0.185]

50 We show that this greedy algorithm can naturally be viewed in the framework of our Algorithm 1 by choosing appropriate surrogate functions f̂_t and ĝ_t. [sent-163, score-0.407]

51 The idea is to use the modular function f as its own surrogate f̂_t and choose the function ĝ_t as a modular lower bound of g. [sent-164, score-0.667]

52 The greedy algorithm for SSC [29] can be seen as an instance of Algorithm 1 by choosing the surrogate function f̂ as f and ĝ as hπ (with π defined in Eqn. (5)). [sent-172, score-0.26]

53 When g is integral, the guarantee of the greedy algorithm is Hg ≜ H(maxj g(j)), where H(d) = Σ_{i=1}^{d} 1/i [29] (henceforth we will use Hg for this quantity). [sent-174, score-0.173]
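
A minimal sketch of this cost-effectiveness greedy (the modular cost is given as a dict w of positive weights; the truncated gain and the stopping rule follow the standard treatment, but the code is an illustrative assumption rather than the paper's implementation):

```python
def greedy_ssc(g, w, V, c):
    """Greedy for min sum_{j in X} w[j]  s.t.  g(X) >= c, with g monotone submodular."""
    X = set()
    while g(X) < c:
        best, best_ratio = None, float("inf")
        for j in set(V) - X:
            gain = min(g(X | {j}), c) - min(g(X), c)     # truncated marginal gain
            if gain > 0 and w[j] / gain < best_ratio:
                best, best_ratio = j, w[j] / gain
        if best is None:                                  # no element helps: give up
            return X
        X.add(best)
    return X
```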

54 Using the greedy algorithm for SK [28] as the approximation oracle in Algorithm 3 provides a [1 + ε, 1 − e^{−1}] bi-criterion approximation algorithm for SSC, for any ε > 0. [sent-180, score-0.283]

55 This result follows immediately from the guarantee of the submodular cost knapsack problem [28] and Theorem 3. [sent-182, score-1.007]

56 We remark that we can also use a simpler version of the greedy step at every iteration [21, 17], and we then obtain a guarantee of (1 + ε, ½(1 − e^{−1})). [sent-184, score-0.227]

57 Iterated Submodular Set Cover (ISSC): The idea here is to iteratively solve the submodular set cover problem, which can be done by replacing f by a modular upper bound at every iteration. [sent-187, score-1.042]

58 In particular, this can be seen as a variant of Algorithm 1, where we start with X^0 = ∅ and choose f̂_t(X) = m^f_{X^t}(X) at every iteration. [sent-188, score-0.152]

59 The surrogate problem at each iteration then becomes min{ m^f_{X^t}(X) | g(X) ≥ c }. [sent-189, score-0.154]

60 ISSC obtains an approximation factor of Kg Hg / (1 + (Kg − 1)(1 − κf)) ≤ n Hg / (1 + (n − 1)(1 − κf)), where Kg = 1 + max{|X| : g(X) < c} and Hg is the approximation factor of the submodular set cover using g. [sent-196, score-1.035]
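
A sketch of the ISSC iteration described above, reusing the greedy_ssc and modular-upper-bound sketches given earlier (the per-element weights below are exactly those of m^f_{X^t,2}, tight at the current solution; the stopping rule and iteration cap are illustrative choices):

```python
def issc(f, g, V, c, n_iters=10):
    """Iterated Submodular Set Cover: solve min{m^f_{X^t}(X) | g(X) >= c} repeatedly."""
    V, X = set(V), set()
    for _ in range(n_iters):
        # Per-element weights of the modular upper bound m^f_{X,2}, tight at X.
        w = {j: (f(V) - f(V - {j})) if j in X else (f(X | {j}) - f(X)) for j in V}
        w = {j: max(wj, 1e-12) for j, wj in w.items()}   # keep weights strictly positive
        X_new = greedy_ssc(g, w, V, c)
        if X_new == X:
            break
        X = X_new
    return X
```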

61 When f is modular (κf = 0), we recover the approximation guarantee of the submodular set cover problem. [sent-200, score-1.143]

62 Ellipsoidal Approximation based Submodular Set Cover (EASSC): In this setting, we use the ellipsoidal approximation discussed in §2. [sent-203, score-0.215]

63 We can compute the κf-curve-normalized version of f (f^κ, see §2), and then compute its ellipsoidal approximation √(w_{f^κ}). [sent-204, score-0.336]

64 We then define the function f̂(X) = f^ea(X) = κf √(w_{f^κ}(X)) + (1 − κf) Σ_{j∈X} f(j) and use this as the surrogate function f̂ for f. [sent-205, score-0.348]

65 The surrogate problem then becomes min{ κf √(w_{f^κ}(X)) + (1 − κf) Σ_{j∈X} f(j) : g(X) ≥ c }. While f̂(X) = f^ea(X) is not modular, it is a weighted sum of a concave over modular function and a modular function. [sent-208, score-0.625]

66 Fortunately, we can use the result from [26], where they show that any function of the form √(w1(X)) + w2(X) can be optimized over any polytope P with an approximation factor of β(1 + ε) for any ε > 0, where β is the approximation factor of optimizing a modular function over P. [sent-209, score-0.392]

67 We use their algorithm to minimize f ea (X) over the submodular set cover constraint and hence we call this algorithm EASSC. [sent-211, score-0.967]

68 EASSC obtains a guarantee of O( √n log n Hg / (1 + (√n log n − 1)(1 − κf)) ), where Hg is the approximation guarantee of the set cover problem. [sent-214, score-0.337]

69 In particular, we can minimize (f ea (X))2 = wf (X) at every iteration, giving a surrogate problem of the form min{wf (X)|g(X) ≥ c}. [sent-216, score-0.348]

70 EASSCc obtains an approximation guarantee of O(√n log n Hg). [sent-221, score-0.18]

71 We first investigate a special case, the submodular knapsack (SK), and then provide three algorithms: two (Gr and ISK) that are practical but have slightly weaker theoretical guarantees, and a third (EASK) that is not scalable but has the tightest guarantee. [sent-226, score-0.938]

72 Submodular Cost Knapsack (SK): We start with a special case of SCSK (Problem 2), where f is a modular function and g is a submodular function. [sent-227, score-0.911]

73 In this case, SCSK turns into the SK problem for which the greedy algorithm with partial enumeration provides a 1 − e−1 approximation [28]. [sent-228, score-0.185]

74 The greedy algorithm can be seen as an instance of Algorithm 1 with ĝ being the modular lower bound of g and f̂ being f, which is already modular. [sent-229, score-0.331]

75 Choosing the surrogate function f̂ as f and ĝ as hπ (with π defined in Eqn. (5)) in Algorithm 1 with appropriate initialization obtains a guarantee of 1 − 1/e for SK. [sent-234, score-0.232]
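
For reference, a simplified sketch of the cost-benefit greedy for SK (modular cost w, monotone submodular g, budget b); the full 1 − e^{−1} guarantee of [28] additionally requires partial enumeration over small seed sets, which this sketch omits:

```python
def greedy_sk(g, w, V, b):
    """Greedy for max g(X)  s.t.  sum_{j in X} w[j] <= b  (no partial enumeration)."""
    X, spent = set(), 0.0
    while True:
        best, best_ratio = None, 0.0
        for j in set(V) - X:
            if spent + w[j] > b:
                continue                                  # would exceed the budget
            ratio = (g(X | {j}) - g(X)) / max(w[j], 1e-12)
            if ratio > best_ratio:
                best, best_ratio = j, ratio
        if best is None:                                  # nothing feasible/profitable
            return X
        X.add(best)
        spent += w[best]
```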

76 Greedy (Gr): A similar greedy algorithm can provide approximation guarantees for the general SCSK problem, with submodular f and g. [sent-235, score-0.918]

77 Unlike the knapsack case in (5), however, at iteration i we choose an element j ∉ S^{i−1} with f(S^{i−1} ∪ {j}) ≤ b which maximizes g(j|S^{i−1}). [sent-236, score-0.261]
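
A sketch of this greedy step (Gr) for general SCSK: among the elements that keep the submodular budget f(S ∪ {j}) ≤ b feasible, add the one with the largest gain in g (the early-stopping test is an illustrative choice):

```python
def greedy_scsk(f, g, V, b):
    """Gr: greedily grow S while f(S) stays within the submodular budget b."""
    S = set()
    while True:
        feasible = [j for j in set(V) - S if f(S | {j}) <= b]
        if not feasible:
            return S
        j_best = max(feasible, key=lambda j: g(S | {j}) - g(S))
        if g(S | {j_best}) - g(S) <= 0:   # no further gain in g
            return S
        S.add(j_best)
```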

78 In particular it holds for maximizing a submodular function g over any down monotone constraint [2]. [sent-245, score-0.787]

79 Iterated Submodular Cost Knapsack (ISK): Here, we choose f̂_t(X) as a modular upper bound of f, tight at X^t. [sent-248, score-0.366]

80 Then at every iteration, we solve max{ g(X) | m^f_{X^t}(X) ≤ b }, which is a submodular maximization problem subject to a knapsack constraint (SK). [sent-250, score-0.937]
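
A sketch of ISK in the same spirit as ISSC: the weights below are the modular part of the upper bound m^f_{X^t,2} tight at the current X^t, its constant part is subtracted from the budget, and the inner problem is solved with the greedy_sk sketch above (iteration cap and stopping rule are illustrative):

```python
def isk(f, g, V, b, n_iters=10):
    """Iterated Submodular Cost Knapsack: solve max{g(X) | m^f_{X^t}(X) <= b} repeatedly."""
    V, X = set(V), set()
    for _ in range(n_iters):
        w = {j: (f(V) - f(V - {j})) if j in X else (f(X | {j}) - f(X)) for j in V}
        const = f(X) - sum(w[j] for j in X)     # constant part of m^f_{X,2}
        X_new = greedy_sk(g, w, V, b - const)   # knapsack over the modular part
        if X_new == X:
            break
        X = X_new
    return X
```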

81 Ellipsoidal Approximation based Submodular Cost Knapsack (EASK): Choosing the ellipsoidal approximation f^ea of f as a surrogate function, we obtain the simpler problem max{ g(X) | κf √(w_{f^κ}(X)) + (1 − κf) Σ_{j∈X} f(j) ≤ b }. [sent-260, score-0.371]

82 In the case when the submodular function has a curvature κf = 1, we can actually provide a simpler algorithm without needing to use the conversion algorithm (Algorithm 2). [sent-271, score-0.87]

83 In this case, we can directly choose the ellipsoidal approximation of f as √(w_f(X)) and solve the surrogate problem max{ g(X) : w_f(X) ≤ b² }. [sent-272, score-0.608]

84 This surrogate problem is a submodular cost knapsack problem, which we can solve using the greedy algorithm. [sent-273, score-1.158]

85 These include multiple covering and knapsack constraints – i. [sent-281, score-0.23]

86 We also consider SCSC and SCSK with non-monotone submodular functions. [sent-284, score-0.709]

87 For any κ > 0, there exist submodular functions with curvature κ such that no polynomial time algorithm for Problems 1 and 2 achieves a bi-criterion factor better than σ/ρ = n^{1/2−ε} / (1 + (n^{1/2−ε} − 1)(1 − κ)) for any ε > 0. [sent-291, score-0.879]

88 This is not surprising since the submodular set cover problem and the submodular cost knapsack problem both have constant factor guarantees. [sent-296, score-1.767]

89 We are motivated by the speech data subset selection application [20, 23] with the submodular function f encouraging limited vocabulary while g tries to achieve acoustic variability. [sent-298, score-0.824]

90 For the coverage function g, we use two types of coverage: one is a facility location function g1(X) = Σ_{i∈V} max_{j∈X} s_ij, while the other is a saturated sum function g2(X) = Σ_{i∈V} min{ Σ_{j∈X} s_ij, α Σ_{j∈V} s_ij }. [sent-300, score-0.193]
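
A sketch of these two coverage functions on a toy similarity matrix (the matrix, α and sizes below are illustrative stand-ins for the TIMIT-based similarities used in the experiments):

```python
import numpy as np

rng = np.random.default_rng(0)
S = rng.random((6, 6))            # s_ij: similarity between items i and j
alpha = 0.3

def g1(X):
    """Facility location: sum_i max_{j in X} s_ij."""
    return float(S[:, sorted(X)].max(axis=1).sum()) if X else 0.0

def g2(X):
    """Saturated sum: sum_i min(sum_{j in X} s_ij, alpha * sum_{j in V} s_ij)."""
    row_caps = alpha * S.sum(axis=1)
    row_vals = S[:, sorted(X)].sum(axis=1) if X else np.zeros(S.shape[0])
    return float(np.minimum(row_vals, row_caps).sum())

print(g1({0, 2}), g2({0, 2}))
```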

91 [Figure: attained g(X) for ISSC, EASSCc, ISK, Gr, EASKc and a random baseline.] 6 Discussions: In this paper, we propose a unifying framework for Problems 1 and 2 based on suitable surrogate functions. [sent-320, score-0.173]

92 We provide a number of iterative algorithms which are very practical and scalable (like Gr, ISK and ISSC), and also algorithms like EASSC and EASK, which though more intensive, obtain tight approximation bounds. [sent-321, score-0.188]

93 For future work, we would like to empirically evaluate our algorithms on many of the real world problems described above, particularly the limited vocabulary data subset selection application for speech corpora, and the machine translation application. [sent-323, score-0.188]

94 Submodular set functions, matroids and the greedy algorithm: tight worst-case bounds and some generalizations of the Rado-Edmonds theorem. [sent-335, score-0.167]

95 Algorithms for approximate minimization of the difference between submodular functions, with applications. [sent-375, score-0.709]

96 The submodular Bregman and Lovász-Bregman divergences with applications. [sent-380, score-0.709]

97 Submodularity beyond submodular energies: coupling edges in graph cuts. [sent-402, score-0.709]

98 A note on the budgeted maximization on submodular functions. [sent-418, score-0.728]

99 A note on maximizing a submodular set function subject to a knapsack constraint. [sent-475, score-0.942]

100 An analysis of the greedy algorithm for the submodular set covering problem. [sent-480, score-0.839]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('submodular', 0.709), ('scsk', 0.246), ('scsc', 0.217), ('knapsack', 0.21), ('modular', 0.202), ('ssc', 0.155), ('ellipsoidal', 0.14), ('surrogate', 0.127), ('wf', 0.121), ('isk', 0.116), ('issc', 0.116), ('eassc', 0.101), ('hg', 0.101), ('ea', 0.1), ('cover', 0.094), ('curvature', 0.092), ('eask', 0.087), ('greedy', 0.087), ('mf', 0.079), ('approximation', 0.075), ('kf', 0.075), ('si', 0.073), ('iyer', 0.071), ('gr', 0.065), ('guarantee', 0.063), ('easkc', 0.058), ('easscc', 0.058), ('tight', 0.054), ('sk', 0.053), ('hardness', 0.052), ('ft', 0.049), ('xa', 0.048), ('problems', 0.046), ('jegelka', 0.044), ('gt', 0.044), ('obtains', 0.042), ('henceforth', 0.042), ('hy', 0.038), ('vocabulary', 0.038), ('monotone', 0.037), ('sij', 0.037), ('xc', 0.037), ('sensor', 0.037), ('xb', 0.036), ('saturated', 0.035), ('kg', 0.035), ('polymatroid', 0.035), ('speech', 0.035), ('functions', 0.035), ('minj', 0.029), ('vte', 0.029), ('bipartite', 0.028), ('translation', 0.027), ('iteration', 0.027), ('subgradient', 0.027), ('optimizer', 0.026), ('bounds', 0.026), ('guillory', 0.025), ('inapproximable', 0.025), ('facility', 0.025), ('placement', 0.025), ('cost', 0.025), ('akin', 0.024), ('choose', 0.024), ('guarantees', 0.024), ('bilmes', 0.024), ('timit', 0.024), ('choosing', 0.023), ('selection', 0.023), ('algorithm', 0.023), ('maximizing', 0.023), ('discussions', 0.023), ('simpler', 0.023), ('ds', 0.022), ('krause', 0.022), ('goemans', 0.022), ('interspeech', 0.022), ('subsume', 0.022), ('coverage', 0.022), ('optimization', 0.022), ('lin', 0.021), ('defer', 0.021), ('iterative', 0.021), ('covering', 0.02), ('factor', 0.02), ('optimally', 0.02), ('acoustic', 0.019), ('budgeted', 0.019), ('naturally', 0.019), ('bound', 0.019), ('factors', 0.019), ('algorithms', 0.019), ('tightest', 0.019), ('cooperative', 0.019), ('gi', 0.018), ('privacy', 0.018), ('summarization', 0.018), ('upper', 0.018), ('constraint', 0.018), ('nh', 0.018)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000007 319 nips-2013-Submodular Optimization with Submodular Cover and Submodular Knapsack Constraints

Author: Rishabh K. Iyer, Jeff A. Bilmes

Abstract: We investigate two new optimization problems — minimizing a submodular function subject to a submodular lower bound constraint (submodular cover) and maximizing a submodular function subject to a submodular upper bound constraint (submodular knapsack). We are motivated by a number of real-world applications in machine learning including sensor placement and data subset selection, which require maximizing a certain submodular function (like coverage or diversity) while simultaneously minimizing another (like cooperative cost). These problems are often posed as minimizing the difference between submodular functions [9, 25] which is in the worst case inapproximable. We show, however, that by phrasing these problems as constrained optimization, which is more natural for many applications, we achieve a number of bounded approximation guarantees. We also show that both these problems are closely related and an approximation algorithm solving one can be used to obtain an approximation guarantee for the other. We provide hardness results for both problems thus showing that our approximation factors are tight up to log-factors. Finally, we empirically demonstrate the performance and good scalability properties of our algorithms. 1

2 0.70540166 78 nips-2013-Curvature and Optimal Algorithms for Learning and Minimizing Submodular Functions

Author: Rishabh K. Iyer, Stefanie Jegelka, Jeff A. Bilmes

Abstract: We investigate three related and important problems connected to machine learning: approximating a submodular function everywhere, learning a submodular function (in a PAC-like setting [28]), and constrained minimization of submodular functions. We show that the complexity of all three problems depends on the “curvature” of the submodular function, and provide lower and upper bounds that refine and improve previous results [2, 6, 8, 27]. Our proof techniques are fairly generic. We either use a black-box transformation of the function (for approximation and learning), or a transformation of algorithms to use an appropriate surrogate function (for minimization). Curiously, curvature has been known to influence approximations for submodular maximization [3, 29], but its effect on minimization, approximation and learning has hitherto been open. We complete this picture, and also support our theoretical claims by empirical results. 1

3 0.38648215 268 nips-2013-Reflection methods for user-friendly submodular optimization

Author: Stefanie Jegelka, Francis Bach, Suvrit Sra

Abstract: Recently, it has become evident that submodularity naturally captures widely occurring concepts in machine learning, signal processing and computer vision. Consequently, there is need for efficient optimization procedures for submodular functions, especially for minimization problems. While general submodular minimization is challenging, we propose a new method that exploits existing decomposability of submodular functions. In contrast to previous approaches, our method is neither approximate, nor impractical, nor does it need any cumbersome parameter tuning. Moreover, it is easy to implement and parallelize. A key component of our method is a formulation of the discrete submodular minimization problem as a continuous best approximation problem that is solved through a sequence of reflections, and its solution can be easily thresholded to obtain an optimal discrete solution. This method solves both the continuous and discrete formulations of the problem, and therefore has applications in learning, inference, and reconstruction. In our experiments, we illustrate the benefits of our method on two image segmentation tasks. 1

4 0.35006893 97 nips-2013-Distributed Submodular Maximization: Identifying Representative Elements in Massive Data

Author: Baharan Mirzasoleiman, Amin Karbasi, Rik Sarkar, Andreas Krause

Abstract: Many large-scale machine learning problems (such as clustering, non-parametric learning, kernel machines, etc.) require selecting, out of a massive data set, a manageable yet representative subset. Such problems can often be reduced to maximizing a submodular set function subject to cardinality constraints. Classical approaches require centralized access to the full data set; but for truly large-scale problems, rendering the data centrally is often impractical. In this paper, we consider the problem of submodular function maximization in a distributed fashion. We develop a simple, two-stage protocol G REE D I, that is easily implemented using MapReduce style computations. We theoretically analyze our approach, and show, that under certain natural conditions, performance close to the (impractical) centralized approach can be achieved. In our extensive experiments, we demonstrate the effectiveness of our approach on several applications, including sparse Gaussian process inference and exemplar-based clustering, on tens of millions of data points using Hadoop. 1

5 0.27032372 29 nips-2013-Adaptive Submodular Maximization in Bandit Setting

Author: Victor Gabillon, Branislav Kveton, Zheng Wen, Brian Eriksson, S. Muthukrishnan

Abstract: Maximization of submodular functions has wide applications in machine learning and artificial intelligence. Adaptive submodular maximization has been traditionally studied under the assumption that the model of the world, the expected gain of choosing an item given previously selected items and their states, is known. In this paper, we study the setting where the expected gain is initially unknown, and it is learned by interacting repeatedly with the optimized function. We propose an efficient algorithm for solving our problem and prove that its expected cumulative regret increases logarithmically with time. Our regret bound captures the inherent property of submodular maximization, earlier mistakes are more costly than later ones. We refer to our approach as Optimistic Adaptive Submodular Maximization (OASM) because it trades off exploration and exploitation based on the optimism in the face of uncertainty principle. We evaluate our method on a preference elicitation problem and show that non-trivial K-step policies can be learned from just a few hundred interactions with the problem. 1

6 0.1081522 207 nips-2013-Near-optimal Anomaly Detection in Graphs using Lovasz Extended Scan Statistic

7 0.093177937 359 nips-2013-Σ-Optimality for Active Learning on Gaussian Random Fields

8 0.077194445 259 nips-2013-Provable Subspace Clustering: When LRR meets SSC

9 0.071897961 291 nips-2013-Sensor Selection in High-Dimensional Gaussian Trees with Nuisances

10 0.059638251 23 nips-2013-Active Learning for Probabilistic Hypotheses Using the Maximum Gibbs Error Criterion

11 0.056623813 240 nips-2013-Optimization, Learning, and Games with Predictable Sequences

12 0.05638276 313 nips-2013-Stochastic Majorization-Minimization Algorithms for Large-Scale Optimization

13 0.045956813 2 nips-2013-(Nearly) Optimal Algorithms for Private Online Learning in Full-information and Bandit Settings

14 0.044927388 72 nips-2013-Convex Calibrated Surrogates for Low-Rank Loss Matrices with Applications to Subset Ranking Losses

15 0.044837292 125 nips-2013-From Bandits to Experts: A Tale of Domination and Independence

16 0.040300604 132 nips-2013-Global MAP-Optimality by Shrinking the Combinatorial Search Area with Convex Relaxation

17 0.040106997 35 nips-2013-Analyzing the Harmonic Structure in Graph-Based Learning

18 0.038919542 249 nips-2013-Polar Operators for Structured Sparse Estimation

19 0.038502611 150 nips-2013-Learning Adaptive Value of Information for Structured Prediction

20 0.038452636 191 nips-2013-Minimax Optimal Algorithms for Unconstrained Linear Optimization


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.125), (1, -0.018), (2, 0.069), (3, 0.017), (4, 0.097), (5, 0.165), (6, -0.367), (7, -0.55), (8, 0.172), (9, -0.224), (10, -0.155), (11, -0.023), (12, -0.159), (13, 0.112), (14, -0.015), (15, 0.062), (16, 0.13), (17, -0.032), (18, 0.028), (19, -0.066), (20, 0.006), (21, -0.008), (22, -0.077), (23, 0.023), (24, -0.003), (25, 0.026), (26, -0.03), (27, -0.023), (28, 0.01), (29, -0.064), (30, 0.075), (31, -0.06), (32, -0.01), (33, 0.004), (34, 0.009), (35, -0.034), (36, 0.006), (37, -0.029), (38, 0.011), (39, 0.005), (40, 0.018), (41, 0.033), (42, 0.021), (43, -0.004), (44, 0.036), (45, -0.023), (46, -0.014), (47, 0.022), (48, 0.056), (49, -0.023)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.96507424 319 nips-2013-Submodular Optimization with Submodular Cover and Submodular Knapsack Constraints

Author: Rishabh K. Iyer, Jeff A. Bilmes

Abstract: We investigate two new optimization problems — minimizing a submodular function subject to a submodular lower bound constraint (submodular cover) and maximizing a submodular function subject to a submodular upper bound constraint (submodular knapsack). We are motivated by a number of real-world applications in machine learning including sensor placement and data subset selection, which require maximizing a certain submodular function (like coverage or diversity) while simultaneously minimizing another (like cooperative cost). These problems are often posed as minimizing the difference between submodular functions [9, 25] which is in the worst case inapproximable. We show, however, that by phrasing these problems as constrained optimization, which is more natural for many applications, we achieve a number of bounded approximation guarantees. We also show that both these problems are closely related and an approximation algorithm solving one can be used to obtain an approximation guarantee for the other. We provide hardness results for both problems thus showing that our approximation factors are tight up to log-factors. Finally, we empirically demonstrate the performance and good scalability properties of our algorithms. 1

2 0.957515 78 nips-2013-Curvature and Optimal Algorithms for Learning and Minimizing Submodular Functions

Author: Rishabh K. Iyer, Stefanie Jegelka, Jeff A. Bilmes

Abstract: We investigate three related and important problems connected to machine learning: approximating a submodular function everywhere, learning a submodular function (in a PAC-like setting [28]), and constrained minimization of submodular functions. We show that the complexity of all three problems depends on the “curvature” of the submodular function, and provide lower and upper bounds that refine and improve previous results [2, 6, 8, 27]. Our proof techniques are fairly generic. We either use a black-box transformation of the function (for approximation and learning), or a transformation of algorithms to use an appropriate surrogate function (for minimization). Curiously, curvature has been known to influence approximations for submodular maximization [3, 29], but its effect on minimization, approximation and learning has hitherto been open. We complete this picture, and also support our theoretical claims by empirical results. 1

3 0.8186425 97 nips-2013-Distributed Submodular Maximization: Identifying Representative Elements in Massive Data

Author: Baharan Mirzasoleiman, Amin Karbasi, Rik Sarkar, Andreas Krause

Abstract: Many large-scale machine learning problems (such as clustering, non-parametric learning, kernel machines, etc.) require selecting, out of a massive data set, a manageable yet representative subset. Such problems can often be reduced to maximizing a submodular set function subject to cardinality constraints. Classical approaches require centralized access to the full data set; but for truly large-scale problems, rendering the data centrally is often impractical. In this paper, we consider the problem of submodular function maximization in a distributed fashion. We develop a simple, two-stage protocol G REE D I, that is easily implemented using MapReduce style computations. We theoretically analyze our approach, and show, that under certain natural conditions, performance close to the (impractical) centralized approach can be achieved. In our extensive experiments, we demonstrate the effectiveness of our approach on several applications, including sparse Gaussian process inference and exemplar-based clustering, on tens of millions of data points using Hadoop. 1

4 0.77314353 268 nips-2013-Reflection methods for user-friendly submodular optimization

Author: Stefanie Jegelka, Francis Bach, Suvrit Sra

Abstract: Recently, it has become evident that submodularity naturally captures widely occurring concepts in machine learning, signal processing and computer vision. Consequently, there is need for efficient optimization procedures for submodular functions, especially for minimization problems. While general submodular minimization is challenging, we propose a new method that exploits existing decomposability of submodular functions. In contrast to previous approaches, our method is neither approximate, nor impractical, nor does it need any cumbersome parameter tuning. Moreover, it is easy to implement and parallelize. A key component of our method is a formulation of the discrete submodular minimization problem as a continuous best approximation problem that is solved through a sequence of reflections, and its solution can be easily thresholded to obtain an optimal discrete solution. This method solves both the continuous and discrete formulations of the problem, and therefore has applications in learning, inference, and reconstruction. In our experiments, we illustrate the benefits of our method on two image segmentation tasks. 1

5 0.57600027 29 nips-2013-Adaptive Submodular Maximization in Bandit Setting

Author: Victor Gabillon, Branislav Kveton, Zheng Wen, Brian Eriksson, S. Muthukrishnan

Abstract: Maximization of submodular functions has wide applications in machine learning and artificial intelligence. Adaptive submodular maximization has been traditionally studied under the assumption that the model of the world, the expected gain of choosing an item given previously selected items and their states, is known. In this paper, we study the setting where the expected gain is initially unknown, and it is learned by interacting repeatedly with the optimized function. We propose an efficient algorithm for solving our problem and prove that its expected cumulative regret increases logarithmically with time. Our regret bound captures the inherent property of submodular maximization, earlier mistakes are more costly than later ones. We refer to our approach as Optimistic Adaptive Submodular Maximization (OASM) because it trades off exploration and exploitation based on the optimism in the face of uncertainty principle. We evaluate our method on a preference elicitation problem and show that non-trivial K-step policies can be learned from just a few hundred interactions with the problem. 1

6 0.31702006 207 nips-2013-Near-optimal Anomaly Detection in Graphs using Lovasz Extended Scan Statistic

7 0.23693039 359 nips-2013-Σ-Optimality for Active Learning on Gaussian Random Fields

8 0.20062228 328 nips-2013-The Total Variation on Hypergraphs - Learning on Hypergraphs Revisited

9 0.17103565 132 nips-2013-Global MAP-Optimality by Shrinking the Combinatorial Search Area with Convex Relaxation

10 0.16377372 115 nips-2013-Factorized Asymptotic Bayesian Inference for Latent Feature Models

11 0.16146065 199 nips-2013-More data speeds up training time in learning halfspaces over sparse vectors

12 0.15469399 291 nips-2013-Sensor Selection in High-Dimensional Gaussian Trees with Nuisances

13 0.15112397 80 nips-2013-Data-driven Distributionally Robust Polynomial Optimization

14 0.15029831 142 nips-2013-Information-theoretic lower bounds for distributed statistical estimation with communication constraints

15 0.14797807 288 nips-2013-Scalable Influence Estimation in Continuous-Time Diffusion Networks

16 0.14589196 247 nips-2013-Phase Retrieval using Alternating Minimization

17 0.1452188 249 nips-2013-Polar Operators for Structured Sparse Estimation

18 0.14333527 33 nips-2013-An Approximate, Efficient LP Solver for LP Rounding

19 0.13970703 112 nips-2013-Estimation Bias in Multi-Armed Bandit Algorithms for Search Advertising

20 0.1382076 137 nips-2013-High-Dimensional Gaussian Process Bandits


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(2, 0.026), (13, 0.075), (16, 0.015), (29, 0.24), (33, 0.088), (34, 0.084), (41, 0.03), (49, 0.023), (56, 0.114), (70, 0.03), (85, 0.025), (89, 0.024), (93, 0.063), (95, 0.065)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.71671551 319 nips-2013-Submodular Optimization with Submodular Cover and Submodular Knapsack Constraints

Author: Rishabh K. Iyer, Jeff A. Bilmes

Abstract: We investigate two new optimization problems — minimizing a submodular function subject to a submodular lower bound constraint (submodular cover) and maximizing a submodular function subject to a submodular upper bound constraint (submodular knapsack). We are motivated by a number of real-world applications in machine learning including sensor placement and data subset selection, which require maximizing a certain submodular function (like coverage or diversity) while simultaneously minimizing another (like cooperative cost). These problems are often posed as minimizing the difference between submodular functions [9, 25] which is in the worst case inapproximable. We show, however, that by phrasing these problems as constrained optimization, which is more natural for many applications, we achieve a number of bounded approximation guarantees. We also show that both these problems are closely related and an approximation algorithm solving one can be used to obtain an approximation guarantee for the other. We provide hardness results for both problems thus showing that our approximation factors are tight up to log-factors. Finally, we empirically demonstrate the performance and good scalability properties of our algorithms. 1

2 0.61929327 78 nips-2013-Curvature and Optimal Algorithms for Learning and Minimizing Submodular Functions

Author: Rishabh K. Iyer, Stefanie Jegelka, Jeff A. Bilmes

Abstract: We investigate three related and important problems connected to machine learning: approximating a submodular function everywhere, learning a submodular function (in a PAC-like setting [28]), and constrained minimization of submodular functions. We show that the complexity of all three problems depends on the “curvature” of the submodular function, and provide lower and upper bounds that refine and improve previous results [2, 6, 8, 27]. Our proof techniques are fairly generic. We either use a black-box transformation of the function (for approximation and learning), or a transformation of algorithms to use an appropriate surrogate function (for minimization). Curiously, curvature has been known to influence approximations for submodular maximization [3, 29], but its effect on minimization, approximation and learning has hitherto been open. We complete this picture, and also support our theoretical claims by empirical results. 1

3 0.59068054 170 nips-2013-Learning with Invariance via Linear Functionals on Reproducing Kernel Hilbert Space

Author: Xinhua Zhang, Wee Sun Lee, Yee Whye Teh

Abstract: Incorporating invariance information is important for many learning problems. To exploit invariances, most existing methods resort to approximations that either lead to expensive optimization problems such as semi-definite programming, or rely on separation oracles to retain tractability. Some methods further limit the space of functions and settle for non-convex models. In this paper, we propose a framework for learning in reproducing kernel Hilbert spaces (RKHS) using local invariances that explicitly characterize the behavior of the target function around data instances. These invariances are compactly encoded as linear functionals whose value are penalized by some loss function. Based on a representer theorem that we establish, our formulation can be efficiently optimized via a convex program. For the representer theorem to hold, the linear functionals are required to be bounded in the RKHS, and we show that this is true for a variety of commonly used RKHS and invariances. Experiments on learning with unlabeled data and transform invariances show that the proposed method yields better or similar results compared with the state of the art. 1

4 0.58726794 268 nips-2013-Reflection methods for user-friendly submodular optimization

Author: Stefanie Jegelka, Francis Bach, Suvrit Sra

Abstract: Recently, it has become evident that submodularity naturally captures widely occurring concepts in machine learning, signal processing and computer vision. Consequently, there is need for efficient optimization procedures for submodular functions, especially for minimization problems. While general submodular minimization is challenging, we propose a new method that exploits existing decomposability of submodular functions. In contrast to previous approaches, our method is neither approximate, nor impractical, nor does it need any cumbersome parameter tuning. Moreover, it is easy to implement and parallelize. A key component of our method is a formulation of the discrete submodular minimization problem as a continuous best approximation problem that is solved through a sequence of reflections, and its solution can be easily thresholded to obtain an optimal discrete solution. This method solves both the continuous and discrete formulations of the problem, and therefore has applications in learning, inference, and reconstruction. In our experiments, we illustrate the benefits of our method on two image segmentation tasks. 1

5 0.5795247 184 nips-2013-Marginals-to-Models Reducibility

Author: Tim Roughgarden, Michael Kearns

Abstract: We consider a number of classical and new computational problems regarding marginal distributions, and inference in models specifying a full joint distribution. We prove general and efficient reductions between a number of these problems, which demonstrate that algorithmic progress in inference automatically yields progress for “pure data” problems. Our main technique involves formulating the problems as linear programs, and proving that the dual separation oracle required by the ellipsoid method is provided by the target problem. This technique may be of independent interest in probabilistic inference. 1

6 0.57912678 271 nips-2013-Regularized M-estimators with nonconvexity: Statistical and algorithmic theory for local optima

7 0.57788378 249 nips-2013-Polar Operators for Structured Sparse Estimation

8 0.57779837 215 nips-2013-On Decomposing the Proximal Map

9 0.57717329 94 nips-2013-Distributed $k$-means and $k$-median Clustering on General Topologies

10 0.57705879 309 nips-2013-Statistical Active Learning Algorithms

11 0.57566208 102 nips-2013-Efficient Algorithm for Privately Releasing Smooth Queries

12 0.57557279 23 nips-2013-Active Learning for Probabilistic Hypotheses Using the Maximum Gibbs Error Criterion

13 0.57523036 69 nips-2013-Context-sensitive active sensing in humans

14 0.57380271 99 nips-2013-Dropout Training as Adaptive Regularization

15 0.57089204 150 nips-2013-Learning Adaptive Value of Information for Structured Prediction

16 0.57027119 318 nips-2013-Structured Learning via Logistic Regression

17 0.56972194 294 nips-2013-Similarity Component Analysis

18 0.56902164 154 nips-2013-Learning Gaussian Graphical Models with Observed or Latent FVSs

19 0.56850934 149 nips-2013-Latent Structured Active Learning

20 0.56806535 33 nips-2013-An Approximate, Efficient LP Solver for LP Rounding