nips2009-45 "Beyond Convexity: Online Submodular Minimization" (knowledge-graph by maker-knowledge-mining)
Source: pdf
Author: Elad Hazan, Satyen Kale
Abstract: We consider an online decision problem over a discrete space in which the loss function is submodular. We give algorithms which are computationally efficient and are Hannan-consistent in both the full information and bandit settings.
We give algorithms which are computationally efficient and are Hannan-consistent in both the full information and bandit settings. The performance of an online learning algorithm is measured in terms of its regret, which is the difference between the total cost of the decisions it chooses and the cost of the optimal decision chosen in hindsight. A Hannan-consistent algorithm is one that achieves sublinear regret (as a function of the number of decision-making rounds). Hannan-consistency implies that the average per-round cost of the algorithm converges to that of the optimal decision in hindsight. In the past few decades, a variety of Hannan-consistent algorithms have been devised for a wide range of decision spaces and cost functions, including well-known settings such as prediction from expert advice [10] and online convex optimization [15]. Despite this success, many online decision-making problems still remain open, especially when the decision space is discrete and large (say, of size exponential in the problem parameters) and the cost functions are non-linear.
Our decision space is now the set of all subsets of a ground set of n elements, and the cost functions are assumed to be submodular. A crucial component in these latter results is the celebrated polynomial-time algorithms for submodular function minimization [7]. To motivate the online decision-making problem with submodular cost functions, here is an example from [11]. Maximizing profit is equivalent to minimizing the function −P, which is easily seen to be submodular as well. The online decision problem which arises is to decide which set of products to produce, so as to maximize profits in the long run, without knowing in advance the cost function or the market prices.
In each iteration, we choose a subset of a ground set of n elements, and then observe a submodular cost function which gives the cost of the subset we chose. The goal is to minimize the regret, which is the difference between the total cost of the subsets we chose and the cost of the best single subset in hindsight. In the bandit setting, we only get to observe the cost of the subset we chose; no other information is revealed. Obviously, if we ignore the special structure of these problems, standard algorithms for learning with expert advice and/or with bandit feedback can be applied to this setting. In addition, for the submodular bandits problem, even the regret bounds have an exponential dependence on n. For the bandit version an even more basic question arises: does there exist an algorithm with regret depending only polynomially on n? We give efficient algorithms for both problems, with regret bounded polynomially in n (the underlying dimension) and sublinearly in the number of iterations. For the full information setting, we give two different randomized algorithms with expected regret O(n√T).
We make crucial use of a continuous extension of a submodular function known as the Lovász extension. We obtain our regret bounds by running a (sub)gradient descent algorithm in the style of Zinkevich [15]. For the bandit setting, we give a randomized algorithm with expected regret O(nT^{2/3}). This algorithm also makes use of the Lovász extension and gradient descent. The algorithm folds exploration and exploitation steps into a single sample and obtains the stated regret bound. We also show that these regret bounds hold with high probability. Note that the technique of Flaxman, Kalai and McMahan [1], when applied to the Lovász extension, gives a worse regret bound of O(nT^{3/4}).
A function f : 2^[n] → R is called submodular if for all sets S, T ⊆ [n] such that T ⊆ S, and for all elements i ∈ [n] \ S, we have f(T + i) − f(T) ≥ f(S + i) − f(S). Given access to a value oracle for a submodular function, it is possible to minimize it in polynomial time [3], and indeed even in strongly polynomial time [3, 7, 13, 6, 12, 8].
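As a concrete illustration of the definition above, here is a small Python sketch (ours, not from the paper) that brute-forces the diminishing-returns condition on a tiny ground set, using the cut function of a graph as a standard example of a submodular function:

```python
from itertools import combinations

def is_submodular(f, n):
    """Brute-force check of diminishing returns:
    f(T + i) - f(T) >= f(S + i) - f(S) for all T ⊆ S ⊆ [n] and i ∉ S.
    Exponential time: only usable for tiny n."""
    for k in range(n + 1):
        for S in map(frozenset, combinations(range(n), k)):
            for j in range(len(S) + 1):
                for T in map(frozenset, combinations(sorted(S), j)):
                    for i in set(range(n)) - S:
                        if f(T | {i}) - f(T) < f(S | {i}) - f(S):
                            return False
    return True

# The cut function of an undirected graph is submodular.
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
def cut(S):
    return sum(1 for u, v in edges if (u in S) != (v in S))

print(is_submodular(cut, 4))                 # True
print(is_submodular(lambda S: -cut(S), 4))   # False: -cut is not submodular
```

The enumeration is only for illustration; the point of the oracle results cited above is precisely that minimization does not require such enumeration.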
In the Online Submodular Minimization problem, an online decision maker has to repeatedly choose a subset S_t ⊆ [n]. In each iteration, after choosing the set S_t, the cost of the decision is specified by a submodular function f_t : 2^[n] → [−1, 1]. The regret of the decision maker is defined to be

Regret_T := Σ_{t=1}^{T} f_t(S_t) − min_{S⊆[n]} Σ_{t=1}^{T} f_t(S).

If the sets S_t are chosen by a randomized algorithm, then we consider the expected regret over the randomness in the algorithm. Depending on the kind of feedback the decision maker receives, we distinguish between two settings of the problem:
• Full information setting. In each round t, the decision maker has unlimited access to the value oracles of the previously seen cost functions f_1, f_2, ..., f_{t−1}.
• Bandit setting. In each round t, the decision maker only observes the cost of her decision S_t, viz. f_t(S_t), and no other information.

Theorem 1. In the full information setting of Online Submodular Minimization, there is an efficient randomized algorithm that attains the following regret bound: E[Regret_T] ≤ O(n√T).

Theorem 2. In the bandit setting of Online Submodular Minimization, there is an efficient randomized algorithm that attains the following regret bound: E[Regret_T] ≤ O(nT^{2/3}).
A major technical construction we need for the algorithms is the Lovász extension f̂ of the submodular function f. Before defining the Lovász extension, we need the concept of a chain of subsets of [n]:

Definition 1. A chain of subsets of [n] is a collection of sets A_0 ⊂ A_1 ⊂ · · · ⊂ A_p ⊆ [n].

Now we are ready to define the Lovász extension f̂ (note that this is not the standard definition of the Lovász extension, but an equivalent characterization):

Definition 2. Let x ∈ K = [0, 1]^n, and write x = Σ_{i=0}^{p} μ_i χ_{A_i} for a chain A_0 ⊂ A_1 ⊂ · · · ⊂ A_p and coefficients μ_i ≥ 0 with Σ_i μ_i = 1. Then the value of the Lovász extension f̂ at x is defined to be

f̂(x) := Σ_{i=0}^{p} μ_i f(A_i).

The preceding discussion gives an equivalent way of defining the Lovász extension: choose a threshold τ ∈ [0, 1] uniformly at random and consider the level set S_τ = {i : x_i > τ}; then f̂(x) = E_τ[f(S_τ)].
We will also need the notion of a maximal chain associated with a point x ∈ K, in order to define subgradients of the Lovász extension:

Definition 3. A maximal chain associated with x is any maximal completion of the A_i chain, i.e. a chain ∅ = B_0 ⊂ B_1 ⊂ · · · ⊂ B_n = [n] that contains all of the sets A_i.

We have the following key properties of the Lovász extension (Proposition 3): f̂ agrees with f on the vertices of K, i.e. f̂(χ_S) = f(S) for all S ⊆ [n]; f̂ : K → R is convex; and (part 2) any maximal chain associated with x yields a subgradient g of f̂ at x, given by g(i) = f(B_{π(i)}) − f(B_{π(i)−1}), where π is the permutation associated with the chain.
3 The Full Information Setting

In this section we give two algorithms for regret minimization in the full information setting, both of which attain the same regret bound of O(n√T). The first is a randomized combinatorial algorithm, based on the "follow the leader" approach of Hannan [5] and Kalai and Vempala [9]; the second is an analytical algorithm based on (sub)gradient descent on the Lovász extension.
3.1 A Combinatorial Algorithm

In this section we analyze a combinatorial, strongly polynomial algorithm for minimizing regret in the full information Online Submodular Minimization setting:

Algorithm 1 Submodular Follow-The-Perturbed-Leader
1: Input: parameter η > 0.
3: for t = 1 to T do
4: Use the set S_t = arg min_{S⊆[n]} [ Σ_{τ=1}^{t−1} f_τ(S) + R(S) ], and obtain cost f_t(S_t).

Note that R is a submodular function, and Φ_t, being the sum of submodular functions, is itself submodular.
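A runnable sketch of Algorithm 1 (ours, not the paper's code). The initialization step is not recovered above, so the sketch assumes a modular perturbation R(S) = Σ_{i∈S} r_i with each r_i drawn uniformly from [0, 1/η], and it substitutes brute-force enumeration for a polynomial-time submodular minimizer, so it is only usable for tiny n:

```python
import random
from itertools import combinations

def submodular_ftpl(cost_fns, n, eta, rng=random):
    """Follow-the-perturbed-leader over subsets of [n].
    Plays argmin_S sum_{tau < t} f_tau(S) + R(S) each round and returns
    the regret against the best fixed subset in hindsight."""
    r = [rng.uniform(0, 1.0 / eta) for _ in range(n)]  # assumed perturbation
    all_sets = [frozenset(c) for k in range(n + 1)
                for c in combinations(range(n), k)]
    history, total = [], 0.0
    for f in cost_fns:
        S_t = min(all_sets,
                  key=lambda S: sum(g(S) for g in history) + sum(r[i] for i in S))
        total += f(S_t)
        history.append(f)
    best = min(sum(f(S) for f in cost_fns) for S in all_sets)
    return total - best

# Modular costs f_t(S) = |S Δ A_t| / n, a (modular, hence submodular) special case.
random.seed(0)
n, T = 4, 50
targets = [frozenset(random.sample(range(n), 2)) for _ in range(T)]
costs = [lambda S, A=A: len(S ^ A) / n for A in targets]
print(submodular_ftpl(costs, n, eta=1.0 / T ** 0.5) <= T)  # regret is at most T here
```

Replacing the enumeration with a submodular minimization oracle is exactly what makes the real algorithm strongly polynomial.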
While the algorithm itself is a simple extension of Hannan's [5] follow-the-perturbed-leader algorithm, previous analyses (such as that of Kalai and Vempala [9]), which rely on linearity of the cost functions, cannot be made to work here. Instead, we introduce a new analysis technique: we divide the decision space using n different cuts, so that any two decisions are separated by at least one cut, and then we give an upper bound on the probability that the chosen decision switches sides over each such cut.
We now prove the regret bound of Theorem 1:

Theorem 4. Algorithm 1, run with parameter η = 1/√T, achieves the following regret bound: E[Regret_T] ≤ 6n√T.
The proof relies on a lemma of [9], which bounds the regret as follows:

Regret_T ≤ Σ_{t=1}^{T} ( f_t(S_t) − f_t(S_{t+1}) ) + R(S*) − R(S_1),

where S* = arg min_{S⊆[n]} Σ_{t=1}^{T} f_t(S). To bound the expected regret, by linearity of expectation it suffices to bound E[f_t(S_t) − f_t(S_{t+1})], where for the purpose of analysis we assume that we re-randomize in every round (i.e. a fresh perturbation R is drawn each round, which does not change the expectation).
But if Φ_t(A) + r_i < Φ_t(B) − 2, then we must have i ∉ S_{t+1}, since for any C such that i ∈ C,

Φ_{t+1}(A) = Φ_t(A) + r_i + f_t(A) < Φ_t(B) − 1 < Φ_t(C) + f_t(C) = Φ_{t+1}(C).

The inequalities above use the fact that f_t(S) ∈ [−1, 1] for all S ⊆ [n].
We apply this technique to the Lovász extension of the cost function, coupled with a simple randomized construction of the subgradient, as given in Definition 2.

Algorithm 2
2: for t = 1 to T do
3: Choose a threshold τ ∈ [0, 1] uniformly at random, use the set S_t = {i : x_t(i) > τ}, and obtain cost f_t(S_t).
4: Find a maximal chain associated with x_t, ∅ = B_0 ⊂ B_1 ⊂ B_2 ⊂ · · · ⊂ B_n = [n], and use f_t(B_0), f_t(B_1), ..., f_t(B_n) to compute a subgradient g_t of f̂_t at x_t as in part 2 of Proposition 3.
6: end for

In the analysis of the algorithm, we need the following regret bound. It is a simple extension of Zinkevich's analysis of Online Gradient Descent to vector-valued random variables whose expectation is a subgradient of the cost function (the generality to random variables is not required for this section, but it will be useful in the next section):

Lemma 6. Let ĝ_1, ĝ_2, ... be vector-valued random variables such that E[ĝ_t | x_t] = g_t, where g_t is a subgradient of f̂_t at x_t, and let the updates be x_{t+1} = Π_K(x_t − η ĝ_t). Then the expected regret of playing x_1, x_2, ... satisfies

E[ Σ_{t=1}^{T} f̂_t(x_t) ] − min_{x∈K} Σ_{t=1}^{T} f̂_t(x) ≤ n/(2η) + (η/2) Σ_{t=1}^{T} E[‖ĝ_t‖²].
We can now prove the following regret bound:

Theorem 7. Algorithm 2, run with parameter η = 1/√T, achieves the following regret bound: E[Regret_T] ≤ 3n√T.

Proof. Note that by Definition 2, we have E[f_t(S_t)] = f̂_t(x_t). Since the algorithm runs Online Gradient Descent (Lemma 6) with ĝ_t = g_t (i.e. the exact subgradient), we get

E[Regret_T] = Σ_{t=1}^{T} E[f_t(S_t)] − min_{S⊆[n]} Σ_{t=1}^{T} f_t(S) ≤ Σ_{t=1}^{T} f̂_t(x_t) − min_{x∈K} Σ_{t=1}^{T} f̂_t(x) ≤ n/(2η) + 2ηnT.

Since η = 1/√T, we get the required regret bound. Furthermore, by a simple Hoeffding bound, we also get that with probability at least 1 − ε,

Σ_{t=1}^{T} f_t(S_t) ≤ Σ_{t=1}^{T} E[f_t(S_t)] + √(2T log(1/ε)),

which implies the high-probability regret bound.
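Putting Algorithm 2 together in code, an illustrative sketch under the same assumptions as above: K = [0, 1]^n, the projection Π_K is coordinate-wise clipping, and the chain subgradient of part 2 of Proposition 3 is computed by sorting the coordinates of x_t:

```python
import random

def chain_subgradient(f, x):
    """Subgradient of the Lovász extension at x: sort coordinates decreasingly
    to get the chain B_0 ⊂ ... ⊂ B_n, then g[i] = f(B_{pi(i)}) - f(B_{pi(i)-1})."""
    order = sorted(range(len(x)), key=lambda i: -x[i])
    g, B, prev = [0.0] * len(x), set(), f(frozenset())
    for i in order:
        B.add(i)
        val = f(frozenset(B))
        g[i] = val - prev
        prev = val
    return g

def submodular_ogd(cost_fns, n, eta, rng=random):
    """Algorithm 2 sketch: threshold rounding plus projected subgradient steps."""
    x = [0.5] * n
    losses = []
    for f in cost_fns:
        tau = rng.random()
        S_t = frozenset(i for i in range(n) if x[i] > tau)  # E[f(S_t)] = f_hat(x)
        losses.append(f(S_t))
        g = chain_subgradient(f, x)
        x = [min(1.0, max(0.0, x[i] - eta * g[i])) for i in range(n)]  # project onto K
    return losses

# Sanity check: for a modular f(S) = sum_{i in S} c_i the subgradient is c itself.
c = [0.2, -0.1, 0.4]
g = chain_subgradient(lambda S: sum(c[i] for i in S), [0.3, 0.9, 0.1])
print([round(v, 6) for v in g])  # [0.2, -0.1, 0.4]
```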
4 The Bandit Setting

Algorithm 3
2: for t = 1 to T do
3: Find a maximal chain associated with x_t, ∅ = B_0 ⊂ B_1 ⊂ B_2 ⊂ · · · ⊂ B_n = [n], and let π be the associated permutation as in part 2 of Proposition 3. Then x_t can be written as x_t = Σ_{i=0}^{n} μ_i χ_{B_i}, where μ_i = 0 for the extra sets B_i that were added to complete the maximal chain for x_t.
4: Play a chain set S_t = B_i, chosen with probability ρ_i, and obtain cost f_t(S_t).
5: If S_t = B_0, then set ĝ_t = −(1/ρ_0) f_t(S_t) e_{π(1)}; if S_t = B_n, then set ĝ_t = (1/ρ_n) f_t(S_t) e_{π(n)}. Otherwise, with S_t = B_i for 0 < i < n, choose ε_t ∈ {+1, −1} uniformly at random and set

ĝ_t = (2/ρ_i) f_t(S_t) e_{π(i)}   if ε_t = +1,
ĝ_t = −(2/ρ_i) f_t(S_t) e_{π(i+1)}   if ε_t = −1.

6: Update: set x_{t+1} = Π_K(x_t − η ĝ_t).
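One round of Algorithm 3 as a sketch (ours, not the paper's code). The sampling distribution over the chain is not fully recovered above; the sketch assumes the mixing ρ_i = (1 − δ)μ_i + δ/(n + 1), which is consistent with the analysis that follows. Note that the single query to f_t serves both exploration and exploitation:

```python
import random

def bandit_round(f, x, eta, delta, rng=random):
    """One bandit round: sample one chain set B_i with probability
    rho_i = (1 - delta) * mu_i + delta / (n + 1) (an assumed mixing),
    observe only f(B_i), and build an estimator g_hat whose conditional
    expectation is the chain subgradient g."""
    n = len(x)
    order = sorted(range(n), key=lambda i: -x[i])  # pi(j + 1) = order[j]
    chain = [frozenset()]
    for i in order:
        chain.append(chain[-1] | {i})  # B_0 ⊂ B_1 ⊂ ... ⊂ B_n
    xs = [x[i] for i in order]
    mu = [1.0 - xs[0]] + [xs[j] - xs[j + 1] for j in range(n - 1)] + [xs[-1]]
    rho = [(1 - delta) * m + delta / (n + 1) for m in mu]
    i = rng.choices(range(n + 1), weights=rho)[0]
    loss = f(chain[i])  # the only feedback this round
    g_hat = [0.0] * n
    if i == 0:
        g_hat[order[0]] = -loss / rho[0]       # -(1/rho_0) f e_{pi(1)}
    elif i == n:
        g_hat[order[n - 1]] = loss / rho[n]    # +(1/rho_n) f e_{pi(n)}
    elif rng.random() < 0.5:                   # eps_t = +1
        g_hat[order[i - 1]] = 2 * loss / rho[i]
    else:                                      # eps_t = -1
        g_hat[order[i]] = -2 * loss / rho[i]
    x_next = [min(1.0, max(0.0, x[j] - eta * g_hat[j])) for j in range(n)]
    return chain[i], loss, x_next
```

With δ → 0 this reduces to playing the exact chain decomposition of x_t; the δ/(n + 1) floor keeps every ρ_i bounded away from zero, which is what caps the second moment of ĝ_t.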
A first observation is that, in expectation, the regret of the algorithm above is almost the same as if it had played x_t all along and the loss functions were replaced by the Lovász extensions of the actual loss functions. On the other hand, E_t[f_t(S_t)] = Σ_{i=0}^{n} ρ_i f_t(B_i), so with ρ_i = (1 − δ)μ_i + δ/(n+1),

E_t[f_t(S_t)] − f̂_t(x_t) = Σ_{i=0}^{n} (ρ_i − μ_i) f_t(B_i) ≤ δ Σ_{i=0}^{n} ( 1/(n+1) + μ_i ) |f_t(B_i)| ≤ 2δ.

Next, by Proposition 3, the subgradient of the Lovász extension of f_t at the point x_t corresponding to the maximal chain B_0 ⊂ B_1 ⊂ · · · ⊂ B_n is given by g_t(i) = f_t(B_{π(i)}) − f_t(B_{π(i)−1}). Using this fact, it is easy to check that the random vector ĝ_t is constructed in such a way that E[ĝ_t | x_t] = g_t. Furthermore, we can bound the norm of this estimator as follows:

E_t[‖ĝ_t‖²] ≤ Σ_{i=0}^{n} ρ_i · (4/ρ_i²) = Σ_{i=0}^{n} 4/ρ_i ≤ 4(n+1)²/δ ≤ 16n²/δ.   (2)
Theorem. Algorithm 3, run with parameters δ = n/T^{1/3} and η = 1/T^{2/3}, achieves the following regret bound: E[Regret_T] ≤ O(nT^{2/3}).

Proof. We bound the expected regret as follows:

Σ_{t=1}^{T} E[f_t(S_t)] − min_{S⊆[n]} Σ_{t=1}^{T} f_t(S)
  ≤ 2δT + Σ_{t=1}^{T} E[f̂_t(x_t)] − min_{x∈K} Σ_{t=1}^{T} f̂_t(x)   (by Lemma 8)
  ≤ 2δT + n/(2η) + (η/2) Σ_{t=1}^{T} E[‖ĝ_t‖²]   (by Lemma 6)
  ≤ 2δT + n/(2η) + 8ηn²T/δ.

The bound is now obtained for δ = n/T^{1/3} and η = 1/T^{2/3}.
High probability bounds on the regret. The theorem of the previous section gave a bound on the expected regret. However, a much stronger claim can be made: essentially the same regret bound holds with very high probability (with an exponential tail).

Theorem. With probability at least 1 − 4ε, Algorithm 3, run with parameters δ = n/T^{1/3} and η = 1/T^{2/3}, achieves the regret bound Regret_T ≤ O(nT^{2/3} log(1/ε)).
5 Conclusions and Open Questions

We have described efficient regret minimization algorithms for submodular cost functions, in both the bandit and full information settings. This parallels the work of Streeter and Golovin [14], who study two specific instances of online submodular maximization (for which the offline problem is NP-hard) and give (approximate) regret-minimizing algorithms. An open question is whether it is possible to attain O(√T) regret bounds for online submodular minimization in the bandit setting.
References
[1] Abraham Flaxman, Adam Tauman Kalai, and H. Brendan McMahan, Online convex optimization in the bandit setting: gradient descent without a gradient, SODA, 2005.
[6] Satoru Iwata, A faster scaling algorithm for minimizing submodular functions, SIAM J. Comput.
[7] Satoru Iwata, Lisa Fleischer, and Satoru Fujishige, A combinatorial strongly polynomial algorithm for minimizing submodular functions, J. ACM.
[8] Satoru Iwata and James B. Orlin, A simple combinatorial algorithm for submodular function minimization, SODA '09: Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, Society for Industrial and Applied Mathematics, 2009.
[9] Adam Kalai and Santosh Vempala, Efficient algorithms for online decision problems, Journal of Computer and System Sciences 71(3) (2005), 291–307.
[12] James B. Orlin, A faster strongly polynomial time algorithm for submodular function minimization, Math. Program.
[13] Alexander Schrijver, A combinatorial algorithm minimizing submodular functions in strongly polynomial time, 1999.
[14] Matthew Streeter and Daniel Golovin, An online algorithm for maximizing submodular functions, NIPS, 2008.
wordName wordTfidf (topN-words)
[('st', 0.434), ('submodular', 0.406), ('ft', 0.367), ('regret', 0.31), ('lov', 0.255), ('regrett', 0.227), ('sz', 0.219), ('gt', 0.166), ('xt', 0.149), ('bandit', 0.143), ('online', 0.112), ('maker', 0.097), ('cost', 0.09), ('chain', 0.081), ('decision', 0.079), ('pr', 0.079), ('subgradient', 0.075), ('satoru', 0.071), ('bi', 0.063), ('maximal', 0.062), ('bn', 0.061), ('zinkevich', 0.061), ('bound', 0.058), ('iwata', 0.057), ('extension', 0.054), ('combinatorial', 0.054), ('polynomial', 0.053), ('minimization', 0.052), ('rj', 0.049), ('hannan', 0.049), ('orlin', 0.049), ('ap', 0.047), ('round', 0.045), ('randomness', 0.044), ('eo', 0.043), ('fujishige', 0.043), ('kalai', 0.042), ('descent', 0.042), ('lemma', 0.041), ('ai', 0.04), ('ri', 0.039), ('nt', 0.039), ('subsets', 0.039), ('randomized', 0.038), ('strongly', 0.037), ('oracle', 0.037), ('pro', 0.036), ('distributive', 0.032), ('flaxman', 0.032), ('schrijver', 0.032), ('algorithm', 0.031), ('golovin', 0.028), ('lattices', 0.028), ('streeter', 0.028), ('gradient', 0.028), ('mins', 0.026), ('oblivious', 0.026), ('oracles', 0.026), ('mcmahan', 0.026), ('vempala', 0.026), ('producing', 0.026), ('give', 0.025), ('bounds', 0.025), ('theorem', 0.024), ('uniformly', 0.023), ('item', 0.023), ('elsevier', 0.023), ('proposition', 0.022), ('advice', 0.022), ('choose', 0.022), ('feedback', 0.022), ('functions', 0.021), ('access', 0.021), ('convex', 0.021), ('market', 0.021), ('exploitation', 0.021), ('decisions', 0.021), ('expectation', 0.02), ('hazan', 0.02), ('soda', 0.02), ('minimizing', 0.02), ('full', 0.02), ('algorithms', 0.019), ('run', 0.019), ('linearity', 0.019), ('sub', 0.019), ('expected', 0.019), ('nition', 0.019), ('min', 0.018), ('attains', 0.018), ('chose', 0.018), ('fastest', 0.018), ('attain', 0.018), ('subset', 0.017), ('technique', 0.017), ('sides', 0.017), ('arg', 0.016), ('open', 0.016), ('furthermore', 0.016), ('chapter', 0.016)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000002 45 nips-2009-Beyond Convexity: Online Submodular Minimization
Author: Elad Hazan, Satyen Kale
Abstract: We consider an online decision problem over a discrete space in which the loss function is submodular. We give algorithms which are computationally efficient and are Hannan-consistent in both the full information and bandit settings. 1
2 0.35755974 239 nips-2009-Submodularity Cuts and Applications
Author: Yoshinobu Kawahara, Kiyohito Nagano, Koji Tsuda, Jeff A. Bilmes
Abstract: Several key problems in machine learning, such as feature selection and active learning, can be formulated as submodular set function maximization. We present herein a novel algorithm for maximizing a submodular set function under a cardinality constraint — the algorithm is based on a cutting-plane method and is implemented as an iterative small-scale binary-integer linear programming procedure. It is well known that this problem is NP-hard, and the approximation factor achieved by the greedy algorithm is the theoretical limit for polynomial time. As for (non-polynomial time) exact algorithms that perform reasonably in practice, there has been very little in the literature although the problem is quite important for many applications. Our algorithm is guaranteed to find the exact solution finitely many iterations, and it converges fast in practice due to the efficiency of the cutting-plane mechanism. Moreover, we also provide a method that produces successively decreasing upper-bounds of the optimal solution, while our algorithm provides successively increasing lower-bounds. Thus, the accuracy of the current solution can be estimated at any point, and the algorithm can be stopped early once a desired degree of tolerance is met. We evaluate our algorithm on sensor placement and feature selection applications showing good performance. 1
3 0.35071695 181 nips-2009-Online Learning of Assignments
Author: Matthew Streeter, Daniel Golovin, Andreas Krause
Abstract: Which ads should we display in sponsored search in order to maximize our revenue? How should we dynamically rank information sources to maximize the value of the ranking? These applications exhibit strong diminishing returns: Redundancy decreases the marginal utility of each ad or information source. We show that these and other problems can be formalized as repeatedly selecting an assignment of items to positions to maximize a sequence of monotone submodular functions that arrive one by one. We present an efficient algorithm for this general problem and analyze it in the no-regret model. Our algorithm possesses strong theoretical guarantees, such as a performance ratio that converges to the optimal constant of 1 − 1/e. We empirically evaluate our algorithm on two real-world online optimization problems on the web: ad allocation with submodular utilities, and dynamically ranking blogs to detect information cascades. 1
4 0.30447283 178 nips-2009-On Stochastic and Worst-case Models for Investing
Author: Elad Hazan, Satyen Kale
Abstract: In practice, most investing is done assuming a probabilistic model of stock price returns known as the Geometric Brownian Motion (GBM). While often an acceptable approximation, the GBM model is not always valid empirically. This motivates a worst-case approach to investing, called universal portfolio management, where the objective is to maximize wealth relative to the wealth earned by the best fixed portfolio in hindsight. In this paper we tie the two approaches, and design an investment strategy which is universal in the worst-case, and yet capable of exploiting the mostly valid GBM model. Our method is based on new and improved regret bounds for online convex optimization with exp-concave loss functions. 1
5 0.27284831 220 nips-2009-Slow Learners are Fast
Author: Martin Zinkevich, John Langford, Alex J. Smola
Abstract: Online learning algorithms have impressive convergence properties when it comes to risk minimization and convex games on very large problems. However, they are inherently sequential in their design which prevents them from taking advantage of modern multi-core architectures. In this paper we prove that online learning with delayed updates converges well, thereby facilitating parallel online learning. 1
6 0.19302544 24 nips-2009-Adapting to the Shifting Intent of Search Queries
7 0.16708125 14 nips-2009-A Parameter-free Hedging Algorithm
8 0.16637769 215 nips-2009-Sensitivity analysis in HMMs with application to likelihood maximization
9 0.14159453 63 nips-2009-DUOL: A Double Updating Approach for Online Learning
10 0.1150302 156 nips-2009-Monte Carlo Sampling for Regret Minimization in Extensive Games
11 0.10254815 48 nips-2009-Bootstrapping from Game Tree Search
12 0.096848927 116 nips-2009-Information-theoretic lower bounds on the oracle complexity of convex optimization
13 0.095889516 171 nips-2009-Nonparametric Bayesian Models for Unsupervised Event Coreference Resolution
14 0.092336766 76 nips-2009-Efficient Learning using Forward-Backward Splitting
15 0.089387558 122 nips-2009-Label Selection on Graphs
16 0.084817186 22 nips-2009-Accelerated Gradient Methods for Stochastic Optimization and Online Learning
17 0.077288352 216 nips-2009-Sequential effects reflect parallel learning of multiple environmental regularities
18 0.075453371 202 nips-2009-Regularized Distance Metric Learning:Theory and Algorithm
19 0.069111452 27 nips-2009-Adaptive Regularization of Weight Vectors
20 0.069001496 101 nips-2009-Generalization Errors and Learning Curves for Regression with Multi-task Gaussian Processes
topicId topicWeight
[(0, -0.17), (1, 0.269), (2, 0.196), (3, -0.196), (4, 0.248), (5, 0.288), (6, -0.105), (7, 0.083), (8, -0.077), (9, 0.061), (10, -0.197), (11, -0.024), (12, 0.012), (13, 0.06), (14, -0.128), (15, 0.395), (16, 0.045), (17, 0.048), (18, -0.05), (19, 0.028), (20, -0.12), (21, 0.057), (22, -0.017), (23, -0.115), (24, -0.04), (25, -0.034), (26, -0.013), (27, 0.059), (28, -0.026), (29, 0.074), (30, 0.036), (31, 0.031), (32, -0.005), (33, 0.017), (34, 0.015), (35, 0.025), (36, 0.011), (37, 0.011), (38, 0.046), (39, -0.001), (40, -0.015), (41, 0.089), (42, -0.027), (43, 0.009), (44, 0.029), (45, -0.014), (46, -0.027), (47, -0.012), (48, -0.001), (49, -0.012)]
simIndex simValue paperId paperTitle
same-paper 1 0.96807337 45 nips-2009-Beyond Convexity: Online Submodular Minimization
Author: Elad Hazan, Satyen Kale
Abstract: We consider an online decision problem over a discrete space in which the loss function is submodular. We give algorithms which are computationally efficient and are Hannan-consistent in both the full information and bandit settings. 1
2 0.84209132 181 nips-2009-Online Learning of Assignments
Author: Matthew Streeter, Daniel Golovin, Andreas Krause
Abstract: Which ads should we display in sponsored search in order to maximize our revenue? How should we dynamically rank information sources to maximize the value of the ranking? These applications exhibit strong diminishing returns: Redundancy decreases the marginal utility of each ad or information source. We show that these and other problems can be formalized as repeatedly selecting an assignment of items to positions to maximize a sequence of monotone submodular functions that arrive one by one. We present an efficient algorithm for this general problem and analyze it in the no-regret model. Our algorithm possesses strong theoretical guarantees, such as a performance ratio that converges to the optimal constant of 1 − 1/e. We empirically evaluate our algorithm on two real-world online optimization problems on the web: ad allocation with submodular utilities, and dynamically ranking blogs to detect information cascades. 1
3 0.68529153 239 nips-2009-Submodularity Cuts and Applications
Author: Yoshinobu Kawahara, Kiyohito Nagano, Koji Tsuda, Jeff A. Bilmes
Abstract: Several key problems in machine learning, such as feature selection and active learning, can be formulated as submodular set function maximization. We present herein a novel algorithm for maximizing a submodular set function under a cardinality constraint — the algorithm is based on a cutting-plane method and is implemented as an iterative small-scale binary-integer linear programming procedure. It is well known that this problem is NP-hard, and the approximation factor achieved by the greedy algorithm is the theoretical limit for polynomial time. As for (non-polynomial time) exact algorithms that perform reasonably in practice, there has been very little in the literature although the problem is quite important for many applications. Our algorithm is guaranteed to find the exact solution finitely many iterations, and it converges fast in practice due to the efficiency of the cutting-plane mechanism. Moreover, we also provide a method that produces successively decreasing upper-bounds of the optimal solution, while our algorithm provides successively increasing lower-bounds. Thus, the accuracy of the current solution can be estimated at any point, and the algorithm can be stopped early once a desired degree of tolerance is met. We evaluate our algorithm on sensor placement and feature selection applications showing good performance. 1
4 0.63001376 178 nips-2009-On Stochastic and Worst-case Models for Investing
Author: Elad Hazan, Satyen Kale
Abstract: In practice, most investing is done assuming a probabilistic model of stock price returns known as the Geometric Brownian Motion (GBM). While often an acceptable approximation, the GBM model is not always valid empirically. This motivates a worst-case approach to investing, called universal portfolio management, where the objective is to maximize wealth relative to the wealth earned by the best fixed portfolio in hindsight. In this paper we tie the two approaches, and design an investment strategy which is universal in the worst-case, and yet capable of exploiting the mostly valid GBM model. Our method is based on new and improved regret bounds for online convex optimization with exp-concave loss functions. 1
5 0.61874467 220 nips-2009-Slow Learners are Fast
Author: Martin Zinkevich, John Langford, Alex J. Smola
Abstract: Online learning algorithms have impressive convergence properties when it comes to risk minimization and convex games on very large problems. However, they are inherently sequential in their design which prevents them from taking advantage of modern multi-core architectures. In this paper we prove that online learning with delayed updates converges well, thereby facilitating parallel online learning. 1
6 0.61073202 24 nips-2009-Adapting to the Shifting Intent of Search Queries
7 0.46811497 63 nips-2009-DUOL: A Double Updating Approach for Online Learning
8 0.44898805 14 nips-2009-A Parameter-free Hedging Algorithm
9 0.41312331 73 nips-2009-Dual Averaging Method for Regularized Stochastic Learning and Online Optimization
10 0.36597541 215 nips-2009-Sensitivity analysis in HMMs with application to likelihood maximization
11 0.35404572 48 nips-2009-Bootstrapping from Game Tree Search
12 0.34625322 76 nips-2009-Efficient Learning using Forward-Backward Splitting
13 0.33849999 234 nips-2009-Streaming k-means approximation
14 0.31340677 122 nips-2009-Label Selection on Graphs
15 0.29855579 27 nips-2009-Adaptive Regularization of Weight Vectors
16 0.29274288 156 nips-2009-Monte Carlo Sampling for Regret Minimization in Extensive Games
17 0.28140095 116 nips-2009-Information-theoretic lower bounds on the oracle complexity of convex optimization
18 0.27531931 159 nips-2009-Multi-Step Dyna Planning for Policy Evaluation and Control
19 0.26558286 11 nips-2009-A General Projection Property for Distribution Families
20 0.24575707 233 nips-2009-Streaming Pointwise Mutual Information
topicId topicWeight
[(24, 0.183), (25, 0.056), (35, 0.044), (36, 0.104), (39, 0.032), (54, 0.248), (58, 0.076), (61, 0.01), (71, 0.06), (81, 0.015), (86, 0.041), (91, 0.018)]
simIndex simValue paperId paperTitle
same-paper 1 0.82981724 45 nips-2009-Beyond Convexity: Online Submodular Minimization
Author: Elad Hazan, Satyen Kale
Abstract: We consider an online decision problem over a discrete space in which the loss function is submodular. We give algorithms which are computationally efficient and are Hannan-consistent in both the full information and bandit settings. 1
2 0.73033577 189 nips-2009-Periodic Step Size Adaptation for Single Pass On-line Learning
Author: Chun-nan Hsu, Yu-ming Chang, Hanshen Huang, Yuh-jye Lee
Abstract: It has been established that the second-order stochastic gradient descent (2SGD) method can potentially achieve generalization performance as well as empirical optimum in a single pass (i.e., epoch) through the training examples. However, 2SGD requires computing the inverse of the Hessian matrix of the loss function, which is prohibitively expensive. This paper presents Periodic Step-size Adaptation (PSA), which approximates the Jacobian matrix of the mapping function and explores a linear relation between the Jacobian and Hessian to approximate the Hessian periodically and achieve near-optimal results in experiments on a wide variety of models and tasks. 1
3 0.7092734 36 nips-2009-Asymptotic Analysis of MAP Estimation via the Replica Method and Compressed Sensing
Author: Sundeep Rangan, Vivek Goyal, Alyson K. Fletcher
Abstract: The replica method is a non-rigorous but widely-accepted technique from statistical physics used in the asymptotic analysis of large, random, nonlinear problems. This paper applies the replica method to non-Gaussian maximum a posteriori (MAP) estimation. It is shown that with random linear measurements and Gaussian noise, the asymptotic behavior of the MAP estimate of an n-dimensional vector “decouples” as n scalar MAP estimators. The result is a counterpart to Guo and Verdú's replica analysis of minimum mean-squared error estimation. The replica MAP analysis can be readily applied to many estimators used in compressed sensing, including basis pursuit, lasso, linear estimation with thresholding, and zero norm-regularized estimation. In the case of lasso estimation the scalar estimator reduces to a soft-thresholding operator, and for zero norm-regularized estimation it reduces to a hard-threshold. Among other benefits, the replica method provides a computationally-tractable method for exactly computing various performance metrics including mean-squared error and sparsity pattern recovery probability.
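The two scalar estimators this abstract names have simple closed forms: soft-thresholding (the lasso case, the proximal operator of t·|x|) and hard-thresholding (the zero norm-regularized case). A minimal sketch:

```python
def soft_threshold(v, t):
    """Lasso scalar estimator: argmin_x (x - v)**2 / 2 + t * abs(x).
    Shrinks v toward zero by t, and clips to exactly zero inside [-t, t]."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def hard_threshold(v, t):
    """Zero-norm scalar estimator: keep v unchanged if it exceeds the
    threshold in magnitude, otherwise set it to zero."""
    return v if abs(v) > t else 0.0

print(soft_threshold(3.0, 1.0), soft_threshold(0.5, 1.0))  # 2.0 0.0
print(hard_threshold(3.0, 1.0), hard_threshold(0.5, 1.0))  # 3.0 0.0
```

Both operators act coordinate-wise, which is what the "decoupling" result reduces the n-dimensional MAP problem to.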
4 0.69709843 168 nips-2009-Non-stationary continuous dynamic Bayesian networks
Author: Marco Grzegorczyk, Dirk Husmeier
Abstract: Dynamic Bayesian networks have been applied widely to reconstruct the structure of regulatory processes from time series data. The standard approach is based on the assumption of a homogeneous Markov chain, which is not valid in many real-world scenarios. Recent research efforts addressing this shortcoming have considered undirected graphs, directed graphs for discretized data, or over-flexible models that lack any information sharing among time series segments. In the present article, we propose a non-stationary dynamic Bayesian network for continuous data, in which parameters are allowed to vary among segments, and in which a common network structure provides essential information sharing across segments. Our model is based on a Bayesian multiple change-point process, where the number and location of the change-points are sampled from the posterior distribution.
5 0.69469118 15 nips-2009-A Rate Distortion Approach for Semi-Supervised Conditional Random Fields
Author: Yang Wang, Gholamreza Haffari, Shaojun Wang, Greg Mori
Abstract: We propose a novel information theoretic approach for semi-supervised learning of conditional random fields that defines a training objective to combine the conditional likelihood on labeled data and the mutual information on unlabeled data. In contrast to previous minimum conditional entropy semi-supervised discriminative learning methods, our approach is grounded on a more solid foundation, the rate distortion theory in information theory. We analyze the tractability of the framework for structured prediction and present a convergent variational training algorithm to defy the combinatorial explosion of terms in the sum over label configurations. Our experimental results show the rate distortion approach outperforms standard l2 regularization, minimum conditional entropy regularization as well as maximum conditional entropy regularization on both multi-class classification and sequence labeling problems.
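The unlabeled-data term in the objective above is a mutual information. As a point of reference, here is how I(X;Y) is computed from a small joint distribution table (the tables below are illustrative toy data, not from the paper):

```python
import math

def mutual_information(p_xy):
    """I(X;Y) in nats from a joint probability table p_xy[i][j]."""
    px = [sum(row) for row in p_xy]            # marginal of X
    py = [sum(col) for col in zip(*p_xy)]      # marginal of Y
    mi = 0.0
    for i, row in enumerate(p_xy):
        for j, p in enumerate(row):
            if p > 0.0:
                mi += p * math.log(p / (px[i] * py[j]))
    return mi

independent = [[0.25, 0.25], [0.25, 0.25]]  # X, Y independent -> I = 0
correlated = [[0.5, 0.0], [0.0, 0.5]]       # X = Y -> I = ln 2
print(mutual_information(independent), mutual_information(correlated))
```

In the semi-supervised setting the relevant quantity is between inputs and predicted labels and must be estimated variationally rather than read off a table, which is where the paper's convergent training algorithm comes in.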
6 0.69379872 221 nips-2009-Solving Stochastic Games
7 0.68507051 240 nips-2009-Sufficient Conditions for Agnostic Active Learnable
8 0.68382919 181 nips-2009-Online Learning of Assignments
9 0.67857301 116 nips-2009-Information-theoretic lower bounds on the oracle complexity of convex optimization
10 0.67635226 144 nips-2009-Lower bounds on minimax rates for nonparametric regression with additive sparsity and smoothness
11 0.64439714 161 nips-2009-Nash Equilibria of Static Prediction Games
12 0.64115554 239 nips-2009-Submodularity Cuts and Applications
13 0.63331872 156 nips-2009-Monte Carlo Sampling for Regret Minimization in Extensive Games
14 0.62465054 122 nips-2009-Label Selection on Graphs
15 0.62334061 20 nips-2009-A unified framework for high-dimensional analysis of $M$-estimators with decomposable regularizers
16 0.62165684 14 nips-2009-A Parameter-free Hedging Algorithm
17 0.62064439 94 nips-2009-Fast Learning from Non-i.i.d. Observations
18 0.62029511 12 nips-2009-A Generalized Natural Actor-Critic Algorithm
19 0.61865485 55 nips-2009-Compressed Least-Squares Regression
20 0.6148243 91 nips-2009-Fast, smooth and adaptive regression in metric spaces