nips2009-181 knowledge-graph by maker-knowledge-mining

181 nips-2009-Online Learning of Assignments


Source: pdf

Author: Matthew Streeter, Daniel Golovin, Andreas Krause

Abstract: Which ads should we display in sponsored search in order to maximize our revenue? How should we dynamically rank information sources to maximize the value of the ranking? These applications exhibit strong diminishing returns: Redundancy decreases the marginal utility of each ad or information source. We show that these and other problems can be formalized as repeatedly selecting an assignment of items to positions to maximize a sequence of monotone submodular functions that arrive one by one. We present an efficient algorithm for this general problem and analyze it in the no-regret model. Our algorithm possesses strong theoretical guarantees, such as a performance ratio that converges to the optimal constant of 1 − 1/e. We empirically evaluate our algorithm on two real-world online optimization problems on the web: ad allocation with submodular utilities, and dynamically ranking blogs to detect information cascades.

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Which ads should we display in sponsored search in order to maximize our revenue? [sent-6, score-0.415]

2 These applications exhibit strong diminishing returns: Redundancy decreases the marginal utility of each ad or information source. [sent-8, score-0.283]

3 We show that these and other problems can be formalized as repeatedly selecting an assignment of items to positions to maximize a sequence of monotone submodular functions that arrive one by one. [sent-9, score-0.905]

4 We empirically evaluate our algorithm on two real-world online optimization problems on the web: ad allocation with submodular utilities, and dynamically ranking blogs to detect information cascades. [sent-12, score-0.94]

5 Consider the problem of repeatedly choosing advertisements to display in sponsored search to maximize our revenue. [sent-13, score-0.336]

6 In this and related problems that we call online assignment learning problems, there is a set of positions, a set of items, and a sequence of rounds, and on each round we must assign an item to each position. [sent-15, score-0.391]

7 When there is only one position, this problem becomes the well-studied multiarmed bandit problem [2]. [sent-17, score-0.283]

8 When the positions have a linear ordering, the assignment can be construed as a ranked list of elements, and the problem becomes one of selecting lists online. [sent-18, score-0.219]

9 Online assignment learning thus models a central challenge in web search, sponsored search, news aggregators, and recommendation systems, among other applications. [sent-19, score-0.296]

10 This will be the case if there are diminishing returns on the number of relevant links presented to a user; for example, if it is better to present each user with at least one relevant result than to present half of the users with no relevant results and half with two relevant results. [sent-24, score-0.259]

11 We incorporate these considerations in a flexible way by providing an algorithm that performs well whenever the reward for an assignment is a monotone submodular function of its set of (item, position) pairs. [sent-25, score-0.879]

12 We consider problems where we have K positions (e.g., ...). [sent-27, score-0.219]

13 We address both the offline problem, where the utility function is specified in advance, and the online problem, where a sequence of utility functions arrives over time, and we need to repeatedly select a new assignment. [sent-34, score-0.456]

14 We call an assignment feasible if at most one item is assigned to each position (i.e., ...). [sent-42, score-0.292]

15 Our goal is to find a feasible assignment maximizing a utility function f : 2^V → R≥0. [sent-46, score-0.319]

16 As we discuss later, many important assignment problems satisfy submodularity, a natural diminishing returns property: Assigning a new item to a position k increases the utility more if few elements have been assigned yet, and less if many items have already been assigned. [sent-47, score-0.512]

17 Our goal is thus, for a given non-negative, monotone, and submodular utility function f, to find a feasible assignment S* of maximum utility, S* = arg max_{S∈P} f(S). [sent-52, score-0.847]
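To make the offline problem concrete, here is a minimal sketch in Python (the ads, interests, and coverage-style utility below are invented for illustration; they are not from the paper):

```python
import itertools

# Toy instance: K = 2 positions and three candidate ads per position.
K = 2
items = ["ad_a", "ad_b", "ad_c"]
covers = {"ad_a": {"sports"}, "ad_b": {"sports", "news"}, "ad_c": {"music"}}

def f(S):
    # Monotone submodular utility: number of distinct user interests covered
    # by the chosen (item, position) pairs.
    return len({interest for (a, _) in S for interest in covers[a]})

def feasible_assignments():
    # At most one item per position; None means the position is left empty.
    for choice in itertools.product([None] + items, repeat=K):
        yield frozenset((a, k) for k, a in enumerate(choice) if a is not None)

best = max(feasible_assignments(), key=f)
print(best, f(best))  # a maximizer covers all three interests, e.g. ad_b plus ad_c
```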

18 On each round t (which, e.g., corresponds to a user query for a particular term), we want to select an assignment S_t (ads to display). [sent-63, score-0.242]

19 After we select the assignment, we obtain reward f_t(S_t) for some non-negative monotone submodular utility function f_t. [sent-68, score-1.436]

20 We call the setting where we do not get any information about f_t beyond the reward the bandit feedback model. [sent-69, score-0.664]

21 In contrast, in the full-information feedback model we obtain oracle access to f_t (i.e., ...). [sent-70, score-0.272]

22 The goal is to maximize the total reward we obtain, namely ∑_t f_t(S_t). [sent-74, score-0.411]

23 Following the multiarmed bandit literature, we evaluate our performance after T rounds by comparing our total reward against that obtained by a clairvoyant algorithm with knowledge of the sequence of functions f_1, . . . , f_T. [sent-75, score-0.58]

24 The difference between the clairvoyant algorithm’s total reward and ours is called our regret. [sent-79, score-0.204]

25 However, since sums of submodular functions remain submodular, the clairvoyant algorithm has to solve an offline assignment problem with f(S) = ∑_t f_t(S). [sent-81, score-0.778]

26 To accommodate this fact, we discount the reward of the clairvoyant algorithm by a factor of (1 − 1/e): We define the (1 − 1/e)-regret of a random sequence S_1, . . . , S_T as [sent-83, score-0.261]

27 (1 − 1/e) · max_{S∈P} ∑_{t=1}^T f_t(S) − E[ ∑_{t=1}^T f_t(S_t) ]. [sent-86, score-0.434]
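Given per-round reward tables, this quantity is straightforward to compute; a sketch (the function and argument names are mine, and the brute-force maximum over fixed assignments is only feasible for small instances):

```python
import math

def one_minus_inv_e_regret(reward_tables, algo_rewards):
    """(1 - 1/e)-regret of a reward sequence.

    reward_tables: list over rounds t of dicts mapping each fixed feasible
                   assignment S to its reward f_t(S).
    algo_rewards:  list over rounds t of the (expected) reward f_t(S_t)
                   obtained by the algorithm.
    """
    assignments = reward_tables[0].keys()
    # Total reward of the best fixed assignment in hindsight.
    opt = max(sum(table[S] for table in reward_tables) for S in assignments)
    return (1 - 1 / math.e) * opt - sum(algo_rewards)
```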

28 Our model generalizes several common models for sponsored search ad selection, and web search results. [sent-90, score-0.366]

29 These include models with click-through-rates, in which it is assumed that each (ad, position) pair has some probability p(a, k) of being clicked on, and there is some monetary reward b(a) that is obtained whenever ad a is clicked on. [sent-91, score-0.331]

30 See [7, 12] for more details on sponsored search ad allocation. [sent-93, score-0.299]

31 It is easy to verify that such a reward function is monotone submodular. [sent-95, score-0.372]

32 For example, we can handle the case that each user u is interested in a particular set A_u of ads and looks at a set I_u of positions, and the reward of an assignment S is any monotone-increasing concave function g of |S ∩ (A_u × I_u)|. [sent-101, score-0.497]

33 A simple approach to the assignment problem is the following greedy procedure: the algorithm steps through all K positions (according to some fixed, arbitrary ordering). [sent-108, score-0.529]

34 Perhaps surprisingly, no matter which ordering over the positions is chosen, this so-called locally greedy algorithm produces an assignment that obtains at least half the optimal value [8]. [sent-115, score-0.438]
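A sketch of this locally greedy procedure (my rendering, not the paper's pseudocode; `f` is any monotone submodular function over (item, position) pairs, and skipping zero-gain picks is a minor simplification):

```python
def locally_greedy(f, items_per_position):
    """Step through the positions in a fixed order; at each position, commit to
    the (item, position) pair with the largest marginal gain.  For monotone
    submodular f this obtains at least half the optimal value [8]."""
    S = set()
    for k, items in enumerate(items_per_position):
        gains = {a: f(S | {(a, k)}) - f(S) for a in items}
        best = max(gains, key=gains.get)
        if gains[best] > 0:  # monotonicity makes gains nonnegative; skip zero-gain picks
            S.add((best, k))
    return S
```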

35 We will use this lemma in the analysis of our improved offline algorithm, which uses the locally greedy algorithm as a subroutine. [sent-117, score-0.196]

36 Suppose f : 2^V → R≥0 is of the form f(S) = f_0(S) + ∑_{k=1}^K f_k(S ∩ P_k), where f_0 : 2^V → R≥0 is monotone submodular, and f_k : 2^{P_k} → R≥0 is arbitrary for k ≥ 1. [sent-119, score-0.32]

37 We now present an algorithm that achieves the optimal approximation ratio of 1 − 1/e, improving on the 1/2 approximation of the locally greedy algorithm. [sent-126, score-0.263]

38 Given a coloring c = (c_1, . . . , c_K), define sample_c(S) = ∪_{k=1}^K {x ∈ P_k : (x, c_k) ∈ S}. [sent-134, score-0.263]

39 Given a set S of (item, color) pairs, which we may think of as labeling each item with one or more colors, sample_c(S) returns a set containing each item x that is labeled with whatever color c assigns to the partition that contains x. [sent-135, score-0.451]

40 The latter process, in turn, can be shown to be equivalent to a greedy algorithm for maximizing a (different) submodular function subject to a cardinality constraint, which implies that it achieves a 1 − 1/e approximation ratio [15]. [sent-155, score-0.54]
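A rough sketch of TabularGreedy as described above (again my rendering: the expectation F is computed here by brute-force enumeration of all C^K colorings, which is only feasible for tiny instances, whereas the paper works with expected marginal gains directly):

```python
import itertools
import random

def tabular_greedy(f, items_per_position, C):
    """Fill a K-by-C table of items greedily against F(S) = E_c[f(sample_c(S))],
    then sample one uniformly random color per position for the final output."""
    K = len(items_per_position)
    S = set()  # chosen (item, position, color) triples

    def sample(S, coloring):
        # Keep, for each position k, the items labeled with color coloring[k].
        return {(a, k) for (a, k, c) in S if coloring[k] == c}

    def F(S):
        # Exact expectation over uniformly random colorings (C**K terms).
        colorings = list(itertools.product(range(C), repeat=K))
        return sum(f(sample(S, col)) for col in colorings) / len(colorings)

    for c in range(C):  # one greedy pass per color
        for k, items in enumerate(items_per_position):
            gains = {a: F(S | {(a, k, c)}) - F(S) for a in items}
            best = max(gains, key=gains.get)
            if gains[best] > 0:
                S.add((best, k, c))

    coloring = [random.randrange(C) for _ in range(K)]
    return sample(S, coloring)
```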

41 We will prove the lemma in three steps: (i) H is monotone submodular, (ii) TabularGreedy is simply the locally greedy algorithm for finding a set of C rows that maximizes H, where the c-th greedy step is performed with additive error E_c, and (iii) TabularGreedy obtains the guarantee (3. [sent-183, score-0.534]

42 To show that H is monotone submodular, it suffices to show that F is monotone submodular. [sent-185, score-0.444]

43 Because F(S) = E_c[f(sample_c(S))], and because a convex combination of monotone submodular functions is monotone submodular, it suffices to show that for any particular coloring c, the function f(sample_c(S)) is monotone submodular. [sent-186, score-1.013]
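Spelled out, the convex-combination step is the standard one: with λ_c = P(coloring = c) ≥ 0, S ⊆ T, and x ∉ T,

```latex
F(S \cup \{x\}) - F(S)
  = \sum_{c} \lambda_c \Bigl[ f\bigl(\mathrm{sample}_c(S \cup \{x\})\bigr) - f\bigl(\mathrm{sample}_c(S)\bigr) \Bigr]
  \;\ge\; \sum_{c} \lambda_c \Bigl[ f\bigl(\mathrm{sample}_c(T \cup \{x\})\bigr) - f\bigl(\mathrm{sample}_c(T)\bigr) \Bigr]
  = F(T \cup \{x\}) - F(T),
```

where the middle inequality holds term by term once each f(sample_c(·)) is shown to be monotone submodular, which is exactly the per-coloring claim argued next.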

44 This follows from the definition of sample_c and the fact that f is monotone submodular. [sent-187, score-0.222]

45 For problems with this special property, it is known that the locally greedy algorithm obtains an approximation ratio of β(C) = 1 − (1 − 1/C)^C [15]. [sent-190, score-0.227]

46 This follows from the fact that for any assignment S ∈ P, we can find a set R(S) of C rows such that sample_c(∪_{R∈R(S)} R) = S with probability 1, and therefore H(R(S)) = f(S). [sent-196, score-0.313]

47 Additionally, f_0(R) = P[N ≥ 2] · E_c[∆_c(R) | N ≥ 2] is a monotone submodular function of a set of (item, color) pairs, for the same reasons F is. [sent-207, score-0.569]

48 We now transform the offline algorithm of §3.2 into an online algorithm. [sent-212, score-0.234]

49 The high-level idea behind this transformation is to replace each greedy decision made by the offline algorithm with a no-regret online algorithm. [sent-214, score-0.235]

50 A similar approach was used in [16] and [18] to obtain online algorithms for different (simpler) online problems. [sent-215, score-0.269]

51 Let r_{k,c} be the regret of E_{k,c}, and let β(K, C) = 1 − (1 − 1/C)^C − (K choose 2)/C. Then E[ ∑_{t=1}^T f_t(G_t) ] ≥ β(K, C) · max_{S∈P} ∑_{t=1}^T f_t(S) − ∑_{k,c} E[r_{k,c}]. [sent-227, score-0.495]

52 If TGbandit is run with randomized weighted majority [5] as the subroutine, then E[ ∑_{t=1}^T f_t(G_t) ] ≥ β(K, C) · max_{S∈P} ∑_{t=1}^T f_t(S) − O(KC · √(T log |P_k|)), where β(K, C) = 1 − (1 − 1/C)^C − (K choose 2)/C. [sent-235, score-0.434]
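For reference, a minimal exponential-weights subroutine in the randomized-weighted-majority family (a sketch with one typical learning-rate choice, not the paper's exact subroutine); in TGbandit, one such instance E_{k,c} would run over the items P_k for each (position, color) cell, fed marginal gains as rewards:

```python
import math
import random

class RandomizedWeightedMajority:
    """Exponential-weights no-regret learner: one weight per action, sampling
    proportional to weights, multiplicative updates from full-information
    rewards in [0, 1]."""

    def __init__(self, actions, horizon):
        self.actions = list(actions)
        # A typical learning-rate tuning giving O(sqrt(T log N)) regret.
        self.eta = math.sqrt(math.log(max(len(self.actions), 2)) / max(horizon, 1))
        self.weights = {a: 1.0 for a in self.actions}

    def select(self):
        total = sum(self.weights.values())
        r = random.uniform(0.0, total)
        for a in self.actions:
            r -= self.weights[a]
            if r <= 0:
                return a
        return self.actions[-1]

    def update(self, rewards):
        # rewards: dict giving this round's reward for *every* action
        # (full-information feedback).
        for a in self.actions:
            self.weights[a] *= math.exp(self.eta * rewards[a])
```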

53 Optimizing for C in Corollary 7 yields (1 − 1/e)-regret Θ(K^{3/2} T^{1/4} √OPT), ignoring logarithmic factors, where OPT := max_{S∈P} ∑_{t=1}^T f_t(S) is the value of the static optimum. [sent-237, score-0.217]

54 TGbandit can be modified to work in the bandit feedback model. [sent-239, score-0.297]

55 By using their algorithm as a subroutine within TGbandit, we get similar guarantees, both in the full-information and bandit feedback models. [sent-251, score-0.33]

56 We evaluate TGbandit experimentally on two applications: learning to rank blogs that are effective in detecting cascades of information, and allocating advertisements to maximize revenue. [sent-252, score-0.388]

57 We consider the problem of ranking a set of blogs and news sources on the web. [sent-254, score-0.437]

58 Based on this notion of an information cascade, we would like to select blogs that detect big cascades (containing many nodes) as early as possible (i.e., ...). [sent-257, score-0.351]

59 In [13] it is shown how one can formalize this notion of utility using a monotone submodular function that measures the informativeness of a subset of blogs. [sent-260, score-0.693]

60 Optimizing the submodular function yields a small set of blogs that “covers” most cascades. [sent-261, score-0.579]

61 The work by [13] leaves two major shortcomings: Firstly, they select a set of blogs rather than a ranking, which is of practical importance for the presentation on a web service. [sent-263, score-0.296]

62 Secondly, they do not address the problem of sequential prediction, where the set of blogs must be updated dynamically over time. [sent-264, score-0.257]

63 (c) Performance of TGbandit with C = 1, 2, and 4 colors for the sponsored search ad selection problem (each round is a query). [sent-283, score-0.453]

64 In order to model the blog ranking problem, we adopt the assumption that different users have different attention spans: Each user will only consider blogs appearing in a particular subset of positions. [sent-286, score-0.513]

65 More formally, let g be the monotone submodular function measuring the informativeness of any set of blogs, defined as in [13]. [sent-288, score-0.569]

66 Let S ∩ {P_1 ∪ . . . ∪ P_k} be the assignment of blogs to positions 1 through k. [sent-293, score-0.296]

67 It can be seen that f : 2^V → R≥0 is monotone submodular. [sent-295, score-0.222]
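One way to assemble such an f from g under the attention-span assumption is as an expectation over ranking prefixes; a sketch (the span distribution, blog names, and cascade sets are hypothetical, not from the data set used in the paper):

```python
cascades = {"blog_x": {1, 2}, "blog_y": {2, 3}, "blog_z": {4}}

def g(blogs):
    # Informativeness: number of distinct cascades covered by a set of blogs.
    return len({c for b in blogs for c in cascades[b]})

def make_ranking_objective(g, span_probs):
    # span_probs[k] = probability that a user only considers positions 0..k.
    # f is then a convex combination of monotone submodular functions of S,
    # hence itself monotone submodular.
    def f(S):
        return sum(
            p * g({b for (b, pos) in S if pos <= k})
            for k, p in enumerate(span_probs)
        )
    return f

f = make_ranking_objective(g, span_probs=[0.5, 0.3, 0.2])
print(f({("blog_x", 0), ("blog_y", 1), ("blog_z", 2)}))  # 0.5*2 + 0.3*3 + 0.2*4 = 2.7
```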

68 Based on this data, we run our TabularGreedy algorithm with varying numbers of colors C on the blog data set. [sent-299, score-0.204]

69 We now consider the online problem where on each round t we want to output a ranking S_t. [sent-305, score-0.233]

70 After we select the ranking, a new set of cascades occurs, modeled using a separate submodular function f_t, and we obtain a reward of f_t(S_t). [sent-306, score-1.05]

71 In our experiments, we choose one assignment per day, and define f_t as the utility associated with the cascades occurring on that day. [sent-307, score-0.555]

72 Note that f_t allows us to evaluate the performance of any possible ranking S_t, hence we can apply TGbandit in the full-information feedback model. [sent-308, score-0.33]

73 We normalize the average reward by the utility achieved by the TabularGreedy algorithm (with C = 1) applied to the entire data set. [sent-312, score-0.307]

74 The TGbandit algorithm with C = 4 levels out at an approximately 4% higher reward than the algorithm with C = 1. [sent-315, score-0.216]

75 We evaluate TGbandit for the sponsored search ad selection problem in a simple Markovian model incorporating the value of diverse results and complex position-dependence among clicks. [sent-317, score-0.525]

76 In this model, each user u is defined by two sets of probabilities: p_click(a) for each ad a ∈ A, and p_abandon(k) for each position k ∈ [K]. [sent-318, score-0.549]

77 Given an assignment {a_1, . . . , a_K}, where a_k occupies position k, the user scans the positions in increasing order. [sent-322, score-0.294]

78 For each position k, the user clicks on a_k with probability p_click(a_k), leaving the results page forever. [sent-323, score-0.376]

79 Otherwise, with probability (1 − p_click(a_k)) · p_abandon(k), the user loses interest and abandons the results without clicking on anything. [sent-324, score-0.346]

80 Finally, with probability (1 − p_click(a_k)) · (1 − p_abandon(k)), the user proceeds to look at position k + 1. [sent-325, score-0.422]
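The expected number of clicks for a fixed assignment under this scan model can be computed exactly in one pass over the slots; a sketch (the pclick/pabandon values in the usage example are invented, not the paper's experimental settings):

```python
def expected_clicks(ads, pclick, pabandon):
    """Exact expected number of clicks under the scan model above: at slot k
    the user clicks ads[k] w.p. pclick(ads[k]) and leaves; otherwise abandons
    w.p. pabandon(k); otherwise proceeds to slot k + 1."""
    reach = 1.0  # probability the user is still scanning the current slot
    total = 0.0
    for k, a in enumerate(ads):
        total += reach * pclick(a)
        reach *= (1.0 - pclick(a)) * (1.0 - pabandon(k))
    return total

# Hypothetical parameters: a user who abandons each unclicked slot w.p. 0.5.
print(expected_clicks(
    ads=["ad_1", "ad_2", "ad_3"],
    pclick=lambda a: 0.2,
    pabandon=lambda k: 0.5,
))  # 0.2 + 0.4*0.2 + 0.16*0.2 = 0.312
```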

81 The reward function f_t is the number of clicks, which is either zero or one. [sent-326, score-0.367]

82 In our evaluation, there are 5 positions, 20 available ads, and two (equally frequent) types of users: type 1 users interested in all positions (p_abandon ≡ 0), and type 2 users that quickly lose interest (p_abandon ≡ 0. ...). [sent-330, score-0.224]

83 There are also two types of ads, half of type 1 and half of type 2, and users are probabilistically more interested in ads of their own type than those of the opposite type. [sent-332, score-0.257]

84 Specifically, for both types of users we set p_click(a) = 0. ... [sent-333, score-0.205]

85 It can be shown that with several different types of users with distinct p_click(·) functions, the offline problem of finding an assignment within 1 − 1/e + ε of optimal is NP-hard. [sent-340, score-0.332]

86 This is in contrast to the case in which p_click and p_abandon are the same for all users; in this case the offline problem simply requires finding an optimal policy for a Markov decision process, which can be done efficiently using well-known algorithms. [sent-341, score-0.263]

87 In that model, p_click and p_abandon are the same for all users, and p_abandon is a function of the ad in the slot currently being scanned rather than its index. [sent-343, score-0.514]

88 For a general introduction to the literature on submodular function maximization, see [19]. [sent-344, score-0.347]

89 Our offline problem is known as maximizing a monotone submodular function subject to a (simple) partition matroid constraint in the operations research and theoretical computer science communities. [sent-346, score-0.656]

90 In our context, this means it must have the ability to ask (for example) what the revenue will be if ads a_1 and a_2 are placed in position #1 simultaneously. [sent-351, score-0.248]

91 Like us, they consider sequences of monotone submodular reward functions that arrive online, and develop an online algorithm that uses multi-armed bandit algorithms as subroutines. [sent-354, score-1.112]

92 The key difference from our work is that, as in [16], they are concerned with selecting a set of K items rather than the more general problem of selecting an assignment of items to positions addressed in this paper. [sent-355, score-0.299]

93 However, their result is orthogonal to ours, because our objective function is submodular and not linear.² [sent-358, score-0.347]

94 In this paper, we showed that important problems, such as ad display in sponsored search and computing diverse rankings of information sources on the web, require optimizing assignments under submodular utility functions. [sent-359, score-0.955]

95 Finally, we demonstrated that our algorithm outperforms previous work on two real-world problems, namely online ranking of informative blogs and ad allocation. [sent-362, score-0.568]

96 ² One may linearize a submodular function by using a separate dimension for every possible function argument, but this leads to exponentially worse convergence time and regret bounds for the algorithms in [10] relative to TGbandit. [sent-365, score-0.408]

97 Maximizing a submodular set function subject to a matroid constraint. [sent-378, score-0.393]

98 An analysis of approximations for maximizing submodular set functions - II. [sent-401, score-0.388]

99 An analysis of approximations for maximizing submodular set functions - I. [sent-437, score-0.388]

100 Optimal approximation for the submodular welfare problem in the value oracle model. [sent-452, score-0.378]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('submodular', 0.347), ('bandit', 0.242), ('blogs', 0.232), ('monotone', 0.222), ('ft', 0.217), ('pk', 0.212), ('reedy', 0.201), ('tg', 0.199), ('tabular', 0.19), ('samplec', 0.186), ('ec', 0.151), ('reward', 0.15), ('gc', 0.14), ('pclick', 0.139), ('ads', 0.137), ('sponsored', 0.137), ('assignment', 0.127), ('ad', 0.127), ('utility', 0.124), ('pabandon', 0.124), ('ine', 0.119), ('online', 0.118), ('maxs', 0.108), ('vondr', 0.108), ('colors', 0.097), ('positions', 0.092), ('item', 0.089), ('cascades', 0.087), ('greedy', 0.084), ('user', 0.083), ('rc', 0.081), ('ck', 0.077), ('position', 0.076), ('blog', 0.074), ('users', 0.066), ('greedily', 0.063), ('color', 0.063), ('display', 0.062), ('iu', 0.062), ('regret', 0.061), ('rounds', 0.06), ('ranking', 0.058), ('round', 0.057), ('feedback', 0.055), ('clairvoyant', 0.054), ('streeter', 0.054), ('st', 0.053), ('submodularity', 0.05), ('assignments', 0.05), ('fk', 0.049), ('gt', 0.048), ('matroid', 0.046), ('krause', 0.044), ('maximize', 0.044), ('locally', 0.043), ('ak', 0.043), ('fc', 0.042), ('maximizing', 0.041), ('multiarmed', 0.041), ('golovin', 0.041), ('maxr', 0.041), ('items', 0.04), ('andreas', 0.039), ('diverse', 0.037), ('rankings', 0.036), ('lemma', 0.036), ('search', 0.035), ('revenue', 0.035), ('clicks', 0.035), ('ratio', 0.035), ('jan', 0.034), ('matthew', 0.033), ('repeatedly', 0.033), ('algorithm', 0.033), ('daniel', 0.033), ('web', 0.032), ('obtains', 0.032), ('select', 0.032), ('diminishing', 0.032), ('laurence', 0.031), ('mirrokni', 0.031), ('subsumed', 0.031), ('welfare', 0.031), ('ces', 0.029), ('blum', 0.028), ('feasible', 0.027), ('clicked', 0.027), ('posting', 0.027), ('postings', 0.027), ('half', 0.027), ('dynamically', 0.025), ('marshall', 0.025), ('advertisements', 0.025), ('jon', 0.025), ('nemhauser', 0.025), ('displaying', 0.025), ('arrives', 0.025), ('discount', 0.024), ('returns', 0.024), ('sublinearly', 0.023)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000001 181 nips-2009-Online Learning of Assignments

Author: Matthew Streeter, Daniel Golovin, Andreas Krause

Abstract: Which ads should we display in sponsored search in order to maximize our revenue? How should we dynamically rank information sources to maximize the value of the ranking? These applications exhibit strong diminishing returns: Redundancy decreases the marginal utility of each ad or information source. We show that these and other problems can be formalized as repeatedly selecting an assignment of items to positions to maximize a sequence of monotone submodular functions that arrive one by one. We present an efficient algorithm for this general problem and analyze it in the no-regret model. Our algorithm possesses strong theoretical guarantees, such as a performance ratio that converges to the optimal constant of 1 − 1/e. We empirically evaluate our algorithm on two real-world online optimization problems on the web: ad allocation with submodular utilities, and dynamically ranking blogs to detect information cascades.

2 0.35071695 45 nips-2009-Beyond Convexity: Online Submodular Minimization

Author: Elad Hazan, Satyen Kale

Abstract: We consider an online decision problem over a discrete space in which the loss function is submodular. We give algorithms which are computationally efficient and are Hannan-consistent in both the full information and bandit settings.

3 0.25570643 239 nips-2009-Submodularity Cuts and Applications

Author: Yoshinobu Kawahara, Kiyohito Nagano, Koji Tsuda, Jeff A. Bilmes

Abstract: Several key problems in machine learning, such as feature selection and active learning, can be formulated as submodular set function maximization. We present herein a novel algorithm for maximizing a submodular set function under a cardinality constraint — the algorithm is based on a cutting-plane method and is implemented as an iterative small-scale binary-integer linear programming procedure. It is well known that this problem is NP-hard, and the approximation factor achieved by the greedy algorithm is the theoretical limit for polynomial time. As for (non-polynomial time) exact algorithms that perform reasonably in practice, there has been very little in the literature although the problem is quite important for many applications. Our algorithm is guaranteed to find the exact solution in finitely many iterations, and it converges fast in practice due to the efficiency of the cutting-plane mechanism. Moreover, we also provide a method that produces successively decreasing upper-bounds of the optimal solution, while our algorithm provides successively increasing lower-bounds. Thus, the accuracy of the current solution can be estimated at any point, and the algorithm can be stopped early once a desired degree of tolerance is met. We evaluate our algorithm on sensor placement and feature selection applications showing good performance.

4 0.1905432 24 nips-2009-Adapting to the Shifting Intent of Search Queries

Author: Umar Syed, Aleksandrs Slivkins, Nina Mishra

Abstract: Search engines today present results that are often oblivious to recent shifts in intent. For example, the meaning of the query ‘independence day’ shifts in early July to a US holiday and to a movie around the time of the box office release. While no studies exactly quantify the magnitude of intent-shifting traffic, studies suggest that news events, seasonal topics, pop culture, etc. account for 1/2 the search queries. This paper shows that the signals a search engine receives can be used to both determine that a shift in intent happened, as well as find a result that is now more relevant. We present a meta-algorithm that marries a classifier with a bandit algorithm to achieve regret that depends logarithmically on the number of query impressions, under certain assumptions. We provide strong evidence that this regret is close to the best achievable. Finally, via a series of experiments, we demonstrate that our algorithm outperforms prior approaches, particularly as the amount of intent-shifting traffic increases.

5 0.1266979 220 nips-2009-Slow Learners are Fast

Author: Martin Zinkevich, John Langford, Alex J. Smola

Abstract: Online learning algorithms have impressive convergence properties when it comes to risk minimization and convex games on very large problems. However, they are inherently sequential in their design which prevents them from taking advantage of modern multi-core architectures. In this paper we prove that online learning with delayed updates converges well, thereby facilitating parallel online learning.

6 0.121245 73 nips-2009-Dual Averaging Method for Regularized Stochastic Learning and Online Optimization

7 0.12040316 90 nips-2009-Factor Modeling for Advertisement Targeting

8 0.1035342 178 nips-2009-On Stochastic and Worst-case Models for Investing

9 0.09139552 122 nips-2009-Label Selection on Graphs

10 0.089099653 63 nips-2009-DUOL: A Double Updating Approach for Online Learning

11 0.084311247 52 nips-2009-Code-specific policy gradient rules for spiking neurons

12 0.078948654 250 nips-2009-Training Factor Graphs with Reinforcement Learning for Efficient MAP Inference

13 0.068136126 125 nips-2009-Learning Brain Connectivity of Alzheimer's Disease from Neuroimaging Data

14 0.063539922 202 nips-2009-Regularized Distance Metric Learning:Theory and Algorithm

15 0.06136246 14 nips-2009-A Parameter-free Hedging Algorithm

16 0.057270359 242 nips-2009-The Infinite Partially Observable Markov Decision Process

17 0.056814693 152 nips-2009-Measuring model complexity with the prior predictive

18 0.055384584 141 nips-2009-Local Rules for Global MAP: When Do They Work ?

19 0.053065833 142 nips-2009-Locality-sensitive binary codes from shift-invariant kernels

20 0.052132752 231 nips-2009-Statistical Models of Linear and Nonlinear Contextual Interactions in Early Visual Processing


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.162), (1, 0.135), (2, 0.131), (3, -0.156), (4, 0.089), (5, 0.168), (6, -0.155), (7, 0.004), (8, -0.067), (9, 0.037), (10, -0.178), (11, -0.02), (12, 0.066), (13, 0.094), (14, -0.115), (15, 0.382), (16, 0.082), (17, 0.045), (18, -0.091), (19, 0.031), (20, -0.168), (21, 0.053), (22, -0.027), (23, -0.133), (24, -0.057), (25, 0.021), (26, -0.018), (27, 0.022), (28, -0.033), (29, 0.09), (30, 0.058), (31, 0.08), (32, -0.019), (33, -0.003), (34, -0.013), (35, -0.052), (36, 0.028), (37, 0.004), (38, 0.028), (39, -0.003), (40, -0.109), (41, -0.099), (42, -0.014), (43, -0.011), (44, 0.021), (45, 0.044), (46, 0.011), (47, 0.059), (48, 0.057), (49, 0.008)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.95687956 181 nips-2009-Online Learning of Assignments

Author: Matthew Streeter, Daniel Golovin, Andreas Krause

Abstract: Which ads should we display in sponsored search in order to maximize our revenue? How should we dynamically rank information sources to maximize the value of the ranking? These applications exhibit strong diminishing returns: Redundancy decreases the marginal utility of each ad or information source. We show that these and other problems can be formalized as repeatedly selecting an assignment of items to positions to maximize a sequence of monotone submodular functions that arrive one by one. We present an efficient algorithm for this general problem and analyze it in the no-regret model. Our algorithm possesses strong theoretical guarantees, such as a performance ratio that converges to the optimal constant of 1 − 1/e. We empirically evaluate our algorithm on two real-world online optimization problems on the web: ad allocation with submodular utilities, and dynamically ranking blogs to detect information cascades.

2 0.80411369 45 nips-2009-Beyond Convexity: Online Submodular Minimization

Author: Elad Hazan, Satyen Kale

Abstract: We consider an online decision problem over a discrete space in which the loss function is submodular. We give algorithms which are computationally efficient and are Hannan-consistent in both the full information and bandit settings.

3 0.72499174 239 nips-2009-Submodularity Cuts and Applications

Author: Yoshinobu Kawahara, Kiyohito Nagano, Koji Tsuda, Jeff A. Bilmes

Abstract: Several key problems in machine learning, such as feature selection and active learning, can be formulated as submodular set function maximization. We present herein a novel algorithm for maximizing a submodular set function under a cardinality constraint — the algorithm is based on a cutting-plane method and is implemented as an iterative small-scale binary-integer linear programming procedure. It is well known that this problem is NP-hard, and the approximation factor achieved by the greedy algorithm is the theoretical limit for polynomial time. As for (non-polynomial time) exact algorithms that perform reasonably in practice, there has been very little in the literature although the problem is quite important for many applications. Our algorithm is guaranteed to find the exact solution in finitely many iterations, and it converges fast in practice due to the efficiency of the cutting-plane mechanism. Moreover, we also provide a method that produces successively decreasing upper-bounds of the optimal solution, while our algorithm provides successively increasing lower-bounds. Thus, the accuracy of the current solution can be estimated at any point, and the algorithm can be stopped early once a desired degree of tolerance is met. We evaluate our algorithm on sensor placement and feature selection applications showing good performance.

4 0.58033693 24 nips-2009-Adapting to the Shifting Intent of Search Queries

Author: Umar Syed, Aleksandrs Slivkins, Nina Mishra

Abstract: Search engines today present results that are often oblivious to recent shifts in intent. For example, the meaning of the query ‘independence day’ shifts in early July to a US holiday and to a movie around the time of the box office release. While no studies exactly quantify the magnitude of intent-shifting traffic, studies suggest that news events, seasonal topics, pop culture, etc. account for 1/2 the search queries. This paper shows that the signals a search engine receives can be used to both determine that a shift in intent happened, as well as find a result that is now more relevant. We present a meta-algorithm that marries a classifier with a bandit algorithm to achieve regret that depends logarithmically on the number of query impressions, under certain assumptions. We provide strong evidence that this regret is close to the best achievable. Finally, via a series of experiments, we demonstrate that our algorithm outperforms prior approaches, particularly as the amount of intent-shifting traffic increases.

5 0.44391981 220 nips-2009-Slow Learners are Fast

Author: Martin Zinkevich, John Langford, Alex J. Smola

Abstract: Online learning algorithms have impressive convergence properties when it comes to risk minimization and convex games on very large problems. However, they are inherently sequential in their design which prevents them from taking advantage of modern multi-core architectures. In this paper we prove that online learning with delayed updates converges well, thereby facilitating parallel online learning.

6 0.42529845 73 nips-2009-Dual Averaging Method for Regularized Stochastic Learning and Online Optimization

7 0.42289871 90 nips-2009-Factor Modeling for Advertisement Targeting

8 0.35436952 63 nips-2009-DUOL: A Double Updating Approach for Online Learning

9 0.34785143 14 nips-2009-A Parameter-free Hedging Algorithm

10 0.34549123 48 nips-2009-Bootstrapping from Game Tree Search

11 0.32404935 159 nips-2009-Multi-Step Dyna Planning for Policy Evaluation and Control

12 0.32326573 122 nips-2009-Label Selection on Graphs

13 0.31984156 178 nips-2009-On Stochastic and Worst-case Models for Investing

14 0.31438974 76 nips-2009-Efficient Learning using Forward-Backward Splitting

15 0.29888886 234 nips-2009-Streaming k-means approximation

16 0.24133936 7 nips-2009-A Data-Driven Approach to Modeling Choice

17 0.23691007 25 nips-2009-Adaptive Design Optimization in Experiments with People

18 0.2334758 233 nips-2009-Streaming Pointwise Mutual Information

19 0.22584435 141 nips-2009-Local Rules for Global MAP: When Do They Work ?

20 0.22135085 171 nips-2009-Nonparametric Bayesian Models for Unsupervised Event Coreference Resolution


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(24, 0.531), (25, 0.042), (35, 0.035), (36, 0.052), (39, 0.041), (58, 0.055), (61, 0.024), (71, 0.038), (86, 0.062)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.95973432 144 nips-2009-Lower bounds on minimax rates for nonparametric regression with additive sparsity and smoothness

Author: Garvesh Raskutti, Bin Yu, Martin J. Wainwright

Abstract: We study minimax rates for estimating high-dimensional nonparametric regression models with sparse additive structure and smoothness constraints. More precisely, our goal is to estimate a function f* : R^p → R that has an additive decomposition of the form f*(X_1, . . . , X_p) = ∑_{j∈S} h_j(X_j), where each component function h_j lies in some class H of “smooth” functions, and S ⊂ {1, . . . , p} is an unknown subset with cardinality s = |S|. Given n i.i.d. observations of f*(X) corrupted with additive white Gaussian noise where the covariate vectors (X_1, X_2, X_3, ..., X_p) are drawn with i.i.d. components from some distribution P, we determine lower bounds on the minimax rate for estimating the regression function with respect to squared-L2(P) error. Our main result is a lower bound on the minimax rate that scales as max{ s log(p/s)/n , s ǫ_n²(H) }. The first term reflects the sample size required for performing subset selection, and is independent of the function class H. The second term s ǫ_n²(H) is an s-dimensional estimation term corresponding to the sample size required for estimating a sum of s univariate functions, each chosen from the function class H. It depends linearly on the sparsity index s but is independent of the global dimension p. As a special case, if H corresponds to functions that are m-times differentiable (an m-th order Sobolev space), then the s-dimensional estimation term takes the form s ǫ_n²(H) ≍ s n^{−2m/(2m+1)}. Either of the two terms may be dominant in different regimes, depending on the relation between the sparsity and smoothness of the additive decomposition.

same-paper 2 0.94854891 181 nips-2009-Online Learning of Assignments

Author: Matthew Streeter, Daniel Golovin, Andreas Krause

Abstract: Which ads should we display in sponsored search in order to maximize our revenue? How should we dynamically rank information sources to maximize the value of the ranking? These applications exhibit strong diminishing returns: Redundancy decreases the marginal utility of each ad or information source. We show that these and other problems can be formalized as repeatedly selecting an assignment of items to positions to maximize a sequence of monotone submodular functions that arrive one by one. We present an efficient algorithm for this general problem and analyze it in the no-regret model. Our algorithm possesses strong theoretical guarantees, such as a performance ratio that converges to the optimal constant of 1 − 1/e. We empirically evaluate our algorithm on two real-world online optimization problems on the web: ad allocation with submodular utilities, and dynamically ranking blogs to detect information cascades.

3 0.92497385 240 nips-2009-Sufficient Conditions for Agnostic Active Learnable

Author: Liwei Wang

Abstract: We study pool-based active learning in the presence of noise, i.e. the agnostic setting. Previous works have shown that the effectiveness of agnostic active learning depends on the learning problem and the hypothesis space. Although there are many cases on which active learning is very useful, it is also easy to construct examples that no active learning algorithm can have advantage. In this paper, we propose intuitively reasonable sufficient conditions under which agnostic active learning algorithm is strictly superior to passive supervised learning. We show that under some noise condition, if the Bayesian classification boundary and the underlying distribution are smooth to a finite order, active learning achieves polynomial improvement in the label complexity; if the boundary and the distribution are infinitely smooth, the improvement is exponential.

4 0.91176957 221 nips-2009-Solving Stochastic Games

Author: Liam M. Dermed, Charles L. Isbell

Abstract: Solving multi-agent reinforcement learning problems has proven difficult because of the lack of tractable algorithms. We provide the first approximation algorithm which solves stochastic games with cheap-talk to within ε absolute error of the optimal game-theoretic solution, in time polynomial in 1/ε. Our algorithm extends Murray’s and Gordon’s (2007) modified Bellman equation which determines the set of all possible achievable utilities; this provides us a truly general framework for multi-agent learning. Further, we empirically validate our algorithm and find the computational cost to be orders of magnitude less than what the theory predicts.

5 0.87869412 15 nips-2009-A Rate Distortion Approach for Semi-Supervised Conditional Random Fields

Author: Yang Wang, Gholamreza Haffari, Shaojun Wang, Greg Mori

Abstract: We propose a novel information theoretic approach for semi-supervised learning of conditional random fields that defines a training objective to combine the conditional likelihood on labeled data and the mutual information on unlabeled data. In contrast to previous minimum conditional entropy semi-supervised discriminative learning methods, our approach is grounded on a more solid foundation, the rate distortion theory in information theory. We analyze the tractability of the framework for structured prediction and present a convergent variational training algorithm to defy the combinatorial explosion of terms in the sum over label configurations. Our experimental results show the rate distortion approach outperforms standard l2 regularization, minimum conditional entropy regularization as well as maximum conditional entropy regularization on both multi-class classification and sequence labeling problems.

6 0.75587362 116 nips-2009-Information-theoretic lower bounds on the oracle complexity of convex optimization

7 0.72312796 45 nips-2009-Beyond Convexity: Online Submodular Minimization

8 0.67880356 161 nips-2009-Nash Equilibria of Static Prediction Games

9 0.67077142 156 nips-2009-Monte Carlo Sampling for Regret Minimization in Extensive Games

10 0.64469194 12 nips-2009-A Generalized Natural Actor-Critic Algorithm

11 0.63283825 239 nips-2009-Submodularity Cuts and Applications

12 0.60929626 14 nips-2009-A Parameter-free Hedging Algorithm

13 0.6081751 91 nips-2009-Fast, smooth and adaptive regression in metric spaces

14 0.60696381 122 nips-2009-Label Selection on Graphs

15 0.60037124 178 nips-2009-On Stochastic and Worst-case Models for Investing

16 0.59827465 94 nips-2009-Fast Learning from Non-i.i.d. Observations

17 0.59477079 55 nips-2009-Compressed Least-Squares Regression

18 0.5806734 232 nips-2009-Strategy Grafting in Extensive Games

19 0.57773781 199 nips-2009-Ranking Measures and Loss Functions in Learning to Rank

20 0.57156914 147 nips-2009-Matrix Completion from Noisy Entries