nips nips2009 nips2009-24 knowledge-graph by maker-knowledge-mining

24 nips-2009-Adapting to the Shifting Intent of Search Queries


Source: pdf

Author: Umar Syed, Aleksandrs Slivkins, Nina Mishra

Abstract: Search engines today present results that are often oblivious to recent shifts in intent. For example, the meaning of the query ‘independence day’ shifts in early July to a US holiday and to a movie around the time of the box office release. While no studies exactly quantify the magnitude of intent-shifting traffic, studies suggest that news events, seasonal topics, pop culture, etc account for 1/2 the search queries. This paper shows that the signals a search engine receives can be used to both determine that a shift in intent happened, as well as find a result that is now more relevant. We present a meta-algorithm that marries a classifier with a bandit algorithm to achieve regret that depends logarithmically on the number of query impressions, under certain assumptions. We provide strong evidence that this regret is close to the best achievable. Finally, via a series of experiments, we demonstrate that our algorithm outperforms prior approaches, particularly as the amount of intent-shifting traffic increases. 1

Reference: text


Summary: the most important sentences generated by the tf-idf model

sentIndex sentText sentNum sentScore

1 This paper shows that the signals a search engine receives can be used to both determine that a shift in intent happened, as well as find a result that is now more relevant. [sent-8, score-0.35]

2 We present a meta-algorithm that marries a classifier with a bandit algorithm to achieve regret that depends logarithmically on the number of query impressions, under certain assumptions. [sent-9, score-0.913]

3 We provide strong evidence that this regret is close to the best achievable. [sent-10, score-0.282]

4 More concretely, a query’s intent has shifted if the click distribution over search results at some time differs from the click distribution at a later time. [sent-18, score-0.372]

5 For the query ‘tomato’ on the heels of a tomato salmonella outbreak, the probability a user clicks on a news story describing the outbreak increases while the probability a user clicks on the Wikipedia entry for tomatoes rapidly decreases. [sent-19, score-0.398]

6 There are studies that suggest that queries likely to be intent-shifting — such as pop culture, news events, trends, and seasonal topics queries — constitute roughly half of the search queries that a search engine receives [10]. [sent-20, score-0.685]

7 Since traditional ranking features like PageRank [4] change slowly over time, and may be misleading if user intent has shifted very recently, we want to use just the observed click behavior of users to decide which search results to display. [sent-23, score-0.38]

8 † 1 There are many signals a search engine can use to detect when the intent of a query shifts. [sent-28, score-0.37]

9 Our algorithm is really a meta-algorithm that combines a bandit algorithm designed for the event-free setting with an online classification algorithm. [sent-38, score-0.546]

10 The classifier uses the contexts to predict when events occur, and the bandit algorithm “starts over” on positive predictions. [sent-39, score-0.698]

11 The bandit algorithm provides feedback to the classifier by checking, soon after each of the classifier’s positive predictions, whether the optimal search result actually changed. [sent-40, score-0.628]

12 The key technical hurdle in proving a regret bound is handling events that happen during the “checking” phase. [sent-41, score-0.406]

13 This regret bound has a very weak dependence on T , which is highly desirable for search engines that receive much traffic. [sent-43, score-0.428]

14 Indeed, we show that any bandit algorithm that ignores context suffers regret Ω(√T), even when there is only one event. [sent-45, score-0.840]

15 Unlike many lower bounds for bandit problems, our lower bound holds even when ∆ is a constant independent of T . [sent-46, score-0.5]

16 For empirical evaluation, we ideally need access to the traffic of a real search engine so that search results can be adapted based on real-time click activity. [sent-48, score-0.344]

17 2 Problem Formulation and Preliminaries We view the problem of deciding which search results to display in response to user click behavior as a bandit problem, a well-known type of sequential decision problem. [sent-52, score-0.733]

18 A bandit algorithm A chooses it using only observed information from previous rounds, i. [sent-61, score-0.523]

19 The performance of an algorithm A is measured by its regret: R(A) = E[Σ_{t=1}^T p_t(i_t*) − p_t(i_t)], where an optimal result i_t* = arg max_i p_t(i) is one with maximum click probability, and the expectation is taken over the randomness in the clicks and the internal randomization of the algorithm. [sent-64, score-0.404]
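The regret definition above transcribes directly into code. A minimal sketch (the names `regret`, `click_probs`, and `chosen` are illustrative, not from the paper):

```python
def regret(click_probs, chosen):
    """Realized regret R = sum_t [max_i p_t(i) - p_t(i_t)] given the
    per-round click probabilities p_t and the arms i_t actually chosen.
    A direct transcription of the definition above, not the paper's code."""
    return sum(max(p) - p[i] for p, i in zip(click_probs, chosen))
```

For example, always playing arm 0 when round 2 favors arm 1 incurs regret equal to the gap in that round.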

20 It is reasonable to assume that the number of events k ≪ T , since we believe that abrupt shifts in user intent are relatively rare. [sent-67, score-0.315]

21 Most existing bandit algorithms make no attempt to predict when events will occur, and consequently suffer regret Ω(√T). [sent-68, score-0.906]

22 On the other hand, a typical search engine receives many signals that can be used to predict events, such as bursts in query reformulation, average age of retrieved document, etc. [sent-69, score-0.36]

23 2 We assume that our bandit algorithm receives a context xt ∈ X at each round t, and that there exists a function f ∈ F, in some known concept class F, such that f (xt ) = +1 if an event occurs at round t, and f (xt ) = −1 otherwise. [sent-72, score-1.108]

24 At each round t, an eventful bandit algorithm must choose a result it using only observed information from previous rounds, i. [sent-74, score-0.779]

25 In order to develop an efficient eventful bandit algorithm, we make an additional key assumption: At least one optimal result before an event is significantly suboptimal after the event. [sent-77, score-0.74]

26 More precisely, we assume there exists a minimum shift ǫ_S > 0 such that, whenever an event occurs at round t, we have p_t(i*) < p_{t−1}(i*) − ǫ_S for at least one previously optimal search result i*. [sent-78, score-0.589]

27 Moreover, our bounds are parameterized by ∆ = min_t min_{i ≠ i*} [p_t(i*) − p_t(i)], the minimum suboptimality of any suboptimal result. [sent-80, score-0.296]

28 Online bandit algorithms (see [7] for background) have been considered in the context of ranking. [sent-82, score-0.535]

29 For instance, Radlinski et al [19] showed how to compose several instantiations of a bandit algorithm to produce a ranked list of search results. [sent-83, score-0.628]

30 Pandey et al [18] showed that bandit algorithms can be effective in serving advertisements to search engine users. [sent-84, score-0.683]

31 Even though existing bandit work does not address our problem, there are two key algorithms that we do use in our work. [sent-86, score-0.5]

32 The UCB 1 algorithm [2] assumes fixed click probabilities and has regret at most O((n/∆) log T). [sent-87, score-0.399]

33 The EXP3.S algorithm [3] assumes that click probabilities can change on every round and has regret at most O(k√(nT log(nT))) for arbitrary p_t's. [sent-89, score-0.638]

34 In a recent concurrent and independent work, Yu et al [22] studied bandit problems with “piecewisestationary” distributions, a notion that closely resembles our definition of events. [sent-102, score-0.519]

35 However, they make different assumptions than we do about the information a bandit algorithm can observe. [sent-103, score-0.523]

36 Expressed in the language of our problem setting, they assume that from time-to-time a bandit algorithm receives information about how users would have responded to search results that are never actually displayed. [sent-104, score-0.68]

37 The high-level idea is to use a bandit algorithm such as UCB 1, restart it every time the classifier predicts an event, and use subsequent rounds to generate feedback for the classifier. [sent-107, score-0.658]
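The restart-and-check loop in this sentence can be sketched as follows. This is a toy illustration only: `ThresholdClassifier`, `bwc_loop`, and the scalar context stream are stand-ins invented here, not the paper's actual safe classifier or bandit components.

```python
class ThresholdClassifier:
    """Toy stand-in for a safe classifier over 1-D contexts: starts
    maximally eager (so it never outputs a false negative) and raises
    its threshold only on confirmed false positives."""
    def __init__(self):
        self.threshold = 0.0

    def predict(self, x):
        return x > self.threshold  # "positive" = an event is predicted

    def feedback(self, x, was_event):
        if not was_event:  # false positive: learn to ignore contexts up to x
            self.threshold = max(self.threshold, x)

def bwc_loop(contexts, true_events):
    """Sketch of the BWC control flow: on each positive prediction the
    bandit is restarted, and the subsequent checking phase (simulated
    here by the true_events oracle) labels the prediction for the
    classifier."""
    clf = ThresholdClassifier()
    restarts = []
    for t, x in enumerate(contexts):
        if clf.predict(x):
            restarts.append(t)                 # restart the bandit
            clf.feedback(x, true_events[t])    # feedback from checking phase
    return restarts
```

In this sketch, early false positives are cheap restarts that tighten the classifier's threshold, mirroring the "optimistically restart, then verify" structure described above.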

38 In each round, classifier inputs a context xt and outputs a “positive” or “negative” prediction of whether an event has happened in this round. [sent-110, score-0.446]

39 3 Since both classifier and bandit make predictions (about events and arms, respectively), for clarity we use the term “guess” exclusively to refer to predictions made by bandit, and reserve the term “prediction” for classifier. [sent-113, score-0.81]

40 Each adapting phase j ends as soon as classifier predicts “positive”; the round t when this happens is round tj+1 . [sent-119, score-0.609]

41 After each testing phase j, we generate a boolean prediction l_j of whether there was an event in the first round thereof. [sent-122, score-0.369]

42 Disregarding the interleaved testing phases for the moment, BWC restarts bandit whenever classifier predicts “positive”, optimistically assuming that the prediction is correct. [sent-127, score-0.802]

43 By our assumption that events cause some optimal arm to become significantly suboptimal (see Section 2), an incorrect prediction should result in G+_i ∩ G−_j = ∅, where i is a phase before the putative event and j is a phase after it. [sent-128, score-0.427]

44 First, classifier is safe for a given concept class: if it inputs only correctly labeled samples, it never outputs a false negative. [sent-155, score-0.468]

45 Second, bandit is (L, ǫ)-testable, in the following sense. [sent-156, score-0.5]

46 So an (L, ǫ)-testable 3 Following established convention, we call the options available to a bandit algorithm “arms”. [sent-159, score-0.523]

47 4 bandit algorithm is one that, after L rounds, has a good guess of which arms are optimal and which are at least ǫ-suboptimal. [sent-161, score-0.665]

48 For correctness, we require bandit to be (L, ǫS )-testable, where ǫS is the minimum shift. [sent-162, score-0.528]

49 The performance of bandit is quantified via its event-free regret, i. [sent-163, score-0.5]

50 We assume that the state of classifier is updated only if it receives a labeled sample, and consider a game in which in each round t, classifier receives a context xt ∈ S, outputs a (false) positive, and receives a (correctly) labeled sample (x, false). [sent-167, score-0.81]

51 For a given context set X and a given concept class F, let the FP-complexity of the classifier be the maximal possible number of rounds in such a game, where the maximum is taken over all event oracles f ∈ F and all possible sequences {xt }. [sent-168, score-0.298]

52 We will discuss efficient implementations of a safe classifier and an (L, ǫ)-testable bandit in Sections 5 and 6, respectively. [sent-170, score-0.828]

53 The main technical difficulty in the analysis is that the correct operation of the components of BWC — classifier and bandit — is interdependent. [sent-172, score-0.686]

54 In particular, one challenge is to handle events that occur during the first L rounds of a phase; these events may potentially “contaminate” the L-th round guesses and cause incorrect feedback to classifier. [sent-173, score-0.512]

55 Consider an instance of the eventful bandit problem with number of rounds T , n arms, k events and minimum shift ǫS . [sent-175, score-0.917]

56 Consider algorithm BWC with parameter L and components classifier and bandit such that for this problem instance, classifier is safe, and bandit is (L, ǫS )-testable. [sent-176, score-1.395]

57 If any two events are at least 2L rounds apart, then the regret of BWC is R(T ) ≤ (2k + d) R0 (T ) + (k + d) R0 (L) + kL. [sent-177, score-0.514]

58 (1) where d is the FP-complexity of the classifier and R0 (·) is the event-free regret of bandit. [sent-178, score-0.282]

59 In the +kL term in (1), the k can be replaced by the number of testing phases S that contain both a false positive in round 1 of the phase and an actual event later in the phase; this number can potentially be much smaller than k. [sent-182, score-0.477]

60 By using SafeCl as our classifier, we introduce dF into the regret bound of bwc, and this quantity can be large. [sent-205, score-0.282]

61 However, in Section 7 we show that the regret of any algorithm must depend on dF , unless it depends strongly on the number of rounds T . [sent-206, score-0.413]

62 5 6 Testable Bandit Algorithms In this section we will consider the stochastic n-armed bandit problem. [sent-207, score-0.5]

63 However, in the first L rounds this algorithm incurs regret of Ω(L), which is very suboptimal. [sent-213, score-0.413]

64 For instance, for UCB 1 the regret would be R(L) ≤ O(min((n/∆) log L, √(nL log L))). [sent-214, score-0.282]

65 In this section, we develop an algorithm which has the same regret bound as UCB 1, and is (L, ǫ)testable. [sent-215, score-0.305]

66 For round t, let µ_t(u) be the sample average of arm u, and let n_t(u) be the number of times arm u has been played. [sent-221, score-0.398]

67 Recall that in each round t algorithm UCB 1 chooses an arm u with the highest index I_t(u) = µ_t(u) + r_t(u), where r_t(u) = √(8 log(t)/n_t(u)) is a term that we'll call the confidence radius, whose meaning is that |p(u) − µ_t(u)| ≤ r_t(u) with high probability. [sent-223, score-0.387]
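The index rule in this sentence can be sketched as a minimal UCB 1 loop. This is a simplification with simulated Bernoulli clicks; the function name `ucb1` and the reward simulation are illustrative, not the paper's code.

```python
import math
import random

def ucb1(arms, T):
    """Minimal UCB 1 sketch: play each arm once, then always play the
    arm maximizing the index I_t(u) = mu_t(u) + sqrt(8 log(t) / n_t(u)).
    `arms` holds (unknown to the algorithm) Bernoulli click probabilities."""
    n = len(arms)
    counts = [0] * n    # n_t(u): times arm u has been played
    means = [0.0] * n   # mu_t(u): sample-average reward of arm u
    clicks = 0
    for t in range(1, T + 1):
        if t <= n:
            u = t - 1   # initialization: play each arm once
        else:
            u = max(range(n),
                    key=lambda i: means[i] + math.sqrt(8 * math.log(t) / counts[i]))
        reward = 1 if random.random() < arms[u] else 0
        counts[u] += 1
        means[u] += (reward - means[u]) / counts[u]  # running average update
        clicks += reward
    return clicks, counts
```

With a large gap between arms, the confidence radius of the suboptimal arm shrinks slowly enough that it is sampled only O(log T) times, which is where the logarithmic regret of UCB 1 comes from.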

68 7 Upper and Lower Bounds Plugging the classifier from Section 5 and the bandit algorithm from Section 6 into the metaalgorithm from Section 4, we obtain the following numerical guarantee. [sent-243, score-0.523]

69 Consider an instance S of the eventful bandit problem with number of rounds T, n arms and k events, minimum shift ǫ_S, minimum suboptimality ∆, and concept class diameter d_F. [sent-245, score-1.078]

70 Consider the BWC algorithm with parameter L and components classifier and bandit as presented, respectively, in Sections 5 and 6. [sent-247, score-0.709]

71 Then the regret of BWC is R(T) ≤ ((3k + 2d_F)(n/∆) + k(n/ǫ_S²)) · O(log T). [sent-248, score-0.282]

72 While the linear dependence on n in this bound may seem large, note that without additional assumptions, regret must be linear in n, since each arm must be pulled at least once. [sent-249, score-0.435]

73 We now state two lower bounds about eventful bandit problems; the proofs are in the full version [20]. [sent-251, score-0.6]

74 Theorem 4 shows that in order to achieve regret that is logarithmic in the number of rounds, a context-aware algorithm is necessary, assuming there is at least one event. [sent-252, score-0.329]

75 Incidentally, this lower bound can be easily extended to prove that, in our model, no algorithm can achieve logarithmic regret when an event oracle f is not contained in the concept class F. [sent-253, score-0.512]

76 Consider the eventful bandit problem with number of rounds T, two arms, minimum shift ǫ_S and minimum suboptimality ∆, where ǫ_S = ∆ = ǫ, for an arbitrary ǫ ∈ (0, 1/2). [sent-255, score-0.879]

77 For any context-ignoring bandit algorithm A, there exists a problem instance with a single event such that regret R_A(T) ≥ Ω(ǫ√T). [sent-256, score-0.939]

78 If we desire a regret bound that has logarithmic dependence on the number of rounds, then a linear dependence on k + dF is necessary. [sent-258, score-0.37]

79 Consider the eventful bandit problem with number of rounds T and concept class diameter dF . [sent-260, score-0.785]

80 Moreover, there exists a problem instance with two arms, a single event, event threshold Θ(1) and minimum suboptimality Θ(1) such that regret R_A(T) ≥ Ω(max(T^{1/3}, d_F)) log T. [sent-263, score-0.521]

81 While our theoretical results suggest that regret grows with the number of features in the context space, in our experiments, we surprisingly find that BWC is robust to higher dimensional feature spaces. [sent-269, score-0.317]

82 Due to the random nature with which data is generated, regret is reported as an average over 10 runs. [sent-277, score-0.282]

83 Thus, if there are two features, say query volume and query abandonment rate, an event occurs if and only if both the volume and abandonment rate exceed certain thresholds. [sent-279, score-0.407]

84 Bandit with Classifier (BWC): Figure 1(a) shows the average cumulative regret over time of three algorithms. [sent-280, score-0.307]

85 In addition, we compare to an algorithm we call ORA, which uses the event oracle to reset UCB 1 whenever an event occurs. [sent-282, score-0.281]

86 BWC’s safe classifier makes many mistakes in the beginning and consequently pays the price of believing that each query is experiencing an event when in fact it is not. [sent-286, score-0.365]

87 Figure 1: (a) (Left) BWC's cumulative regret compared to UCB 1 and ORA (UCB 1 with an oracle indicating the exact locations of the intent-shifting event). (b) (Right, Top Table) Final regret (in thousands) as the fraction of intent-shifting queries varies. [sent-333, score-0.741]

88 (c) (Right, Bottom Table) Final regret (in thousands) as the number of features grows. [sent-335, score-0.282]

89 UCB 1 alone ignores the context entirely and thus incurs substantially larger cumulative regret by the end. [sent-337, score-0.342]

90 If there are no intent-shifting queries, then UCB 1’s regret is the best. [sent-340, score-0.282]

91 On the other hand, BWC’s regret dominates the other approaches, especially as the fraction of intent-shifting queries grows. [sent-342, score-0.406]

92 With more intent-shifting queries, the expectation is that regret monotonically increases. [sent-348, score-0.282]

93 There is however a decrease in regret going from 1/4 to 3/8 intent-shifting queries. [sent-350, score-0.282]

94 We believe that this is due to the fact that each query has at most 10 intentshifting events spread uniformly and it is possible that there were fewer events with potentially smaller shifts in intent in those runs. [sent-351, score-0.532]

95 In other words, the standard deviation of the regret is large. [sent-352, score-0.282]

96 In Figure 1(b), we show the cumulative regret after 3M impressions as the dimensionality of the context vector grows from 10 to 40 features. [sent-358, score-0.38]

97 BWC’s regret is consistently close to ORA as the number of features grows. [sent-359, score-0.282]

98 On the other hand, UCB 1’s regret though competitive is worse than BWC, while EXP 3. [sent-360, score-0.282]

99 S’s regret is completely independent of the number of features. [sent-363, score-0.282]

100 The standard deviation of the regret over the 10 runs is substantially lower than the previous experiment. [sent-364, score-0.282]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('bandit', 0.5), ('bwc', 0.4), ('ucb', 0.363), ('regret', 0.282), ('classifier', 0.186), ('round', 0.156), ('safe', 0.142), ('events', 0.124), ('arm', 0.121), ('event', 0.115), ('query', 0.108), ('rounds', 0.108), ('queries', 0.104), ('arms', 0.103), ('eventful', 0.1), ('ora', 0.1), ('intent', 0.098), ('click', 0.094), ('search', 0.086), ('pt', 0.083), ('df', 0.083), ('engine', 0.078), ('suboptimality', 0.077), ('safecl', 0.075), ('sf', 0.068), ('er', 0.068), ('ltj', 0.063), ('phase', 0.059), ('false', 0.058), ('traf', 0.055), ('tj', 0.053), ('user', 0.053), ('phases', 0.05), ('receives', 0.05), ('news', 0.048), ('bandits', 0.044), ('concept', 0.04), ('shifts', 0.04), ('guess', 0.039), ('shift', 0.038), ('clicks', 0.038), ('abandonment', 0.038), ('bursts', 0.038), ('deepak', 0.038), ('happened', 0.038), ('impressions', 0.038), ('intentshifting', 0.038), ('slivkins', 0.038), ('tomato', 0.038), ('classi', 0.037), ('diameter', 0.037), ('context', 0.035), ('xt', 0.033), ('pandey', 0.033), ('live', 0.033), ('dependence', 0.032), ('contexts', 0.032), ('rt', 0.029), ('engines', 0.028), ('oracle', 0.028), ('ranking', 0.028), ('minimum', 0.028), ('predicts', 0.027), ('kleinberg', 0.027), ('robert', 0.026), ('gt', 0.026), ('suboptimal', 0.025), ('cumulative', 0.025), ('aleksandrs', 0.025), ('deepayan', 0.025), ('lasts', 0.025), ('sandeep', 0.025), ('seasonal', 0.025), ('syed', 0.025), ('umar', 0.025), ('xtj', 0.025), ('adapting', 0.025), ('logarithmic', 0.024), ('nicol', 0.024), ('algorithm', 0.023), ('labeled', 0.022), ('played', 0.022), ('shifting', 0.022), ('ra', 0.022), ('chakrabarti', 0.022), ('culture', 0.022), ('multiarmed', 0.022), ('nina', 0.022), ('outbreak', 0.022), ('users', 0.021), ('agarwal', 0.021), ('fraction', 0.02), ('testable', 0.02), ('putative', 0.02), ('testing', 0.02), ('outputs', 0.02), ('positive', 0.019), ('prediction', 0.019), ('instance', 0.019), ('al', 0.019)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000012 24 nips-2009-Adapting to the Shifting Intent of Search Queries

Author: Umar Syed, Aleksandrs Slivkins, Nina Mishra

Abstract: Search engines today present results that are often oblivious to recent shifts in intent. For example, the meaning of the query ‘independence day’ shifts in early July to a US holiday and to a movie around the time of the box office release. While no studies exactly quantify the magnitude of intent-shifting traffic, studies suggest that news events, seasonal topics, pop culture, etc account for 1/2 the search queries. This paper shows that the signals a search engine receives can be used to both determine that a shift in intent happened, as well as find a result that is now more relevant. We present a meta-algorithm that marries a classifier with a bandit algorithm to achieve regret that depends logarithmically on the number of query impressions, under certain assumptions. We provide strong evidence that this regret is close to the best achievable. Finally, via a series of experiments, we demonstrate that our algorithm outperforms prior approaches, particularly as the amount of intent-shifting traffic increases. 1

2 0.19302544 45 nips-2009-Beyond Convexity: Online Submodular Minimization

Author: Elad Hazan, Satyen Kale

Abstract: We consider an online decision problem over a discrete space in which the loss function is submodular. We give algorithms which are computationally efficient and are Hannan-consistent in both the full information and bandit settings. 1

3 0.1905432 181 nips-2009-Online Learning of Assignments

Author: Matthew Streeter, Daniel Golovin, Andreas Krause

Abstract: Which ads should we display in sponsored search in order to maximize our revenue? How should we dynamically rank information sources to maximize the value of the ranking? These applications exhibit strong diminishing returns: Redundancy decreases the marginal utility of each ad or information source. We show that these and other problems can be formalized as repeatedly selecting an assignment of items to positions to maximize a sequence of monotone submodular functions that arrive one by one. We present an efficient algorithm for this general problem and analyze it in the no-regret model. Our algorithm possesses strong theoretical guarantees, such as a performance ratio that converges to the optimal constant of 1 − 1/e. We empirically evaluate our algorithm on two real-world online optimization problems on the web: ad allocation with submodular utilities, and dynamically ranking blogs to detect information cascades. 1

4 0.16462985 14 nips-2009-A Parameter-free Hedging Algorithm

Author: Kamalika Chaudhuri, Yoav Freund, Daniel J. Hsu

Abstract: We study the problem of decision-theoretic online learning (DTOL). Motivated by practical applications, we focus on DTOL when the number of actions is very large. Previous algorithms for learning in this framework have a tunable learning rate parameter, and a barrier to using online-learning in practical applications is that it is not understood how to set this parameter optimally, particularly when the number of actions is large. In this paper, we offer a clean solution by proposing a novel and completely parameter-free algorithm for DTOL. We introduce a new notion of regret, which is more natural for applications with a large number of actions. We show that our algorithm achieves good performance with respect to this new notion of regret; in addition, it also achieves performance close to that of the best bounds achieved by previous algorithms with optimally-tuned parameters, according to previous notions of regret. 1

5 0.14447245 178 nips-2009-On Stochastic and Worst-case Models for Investing

Author: Elad Hazan, Satyen Kale

Abstract: In practice, most investing is done assuming a probabilistic model of stock price returns known as the Geometric Brownian Motion (GBM). While often an acceptable approximation, the GBM model is not always valid empirically. This motivates a worst-case approach to investing, called universal portfolio management, where the objective is to maximize wealth relative to the wealth earned by the best fixed portfolio in hindsight. In this paper we tie the two approaches, and design an investment strategy which is universal in the worst-case, and yet capable of exploiting the mostly valid GBM model. Our method is based on new and improved regret bounds for online convex optimization with exp-concave loss functions. 1

6 0.093368597 156 nips-2009-Monte Carlo Sampling for Regret Minimization in Extensive Games

7 0.08339718 90 nips-2009-Factor Modeling for Advertisement Targeting

8 0.074481875 171 nips-2009-Nonparametric Bayesian Models for Unsupervised Event Coreference Resolution

9 0.06986846 220 nips-2009-Slow Learners are Fast

10 0.069730692 190 nips-2009-Polynomial Semantic Indexing

11 0.068006769 116 nips-2009-Information-theoretic lower bounds on the oracle complexity of convex optimization

12 0.064088948 166 nips-2009-Noisy Generalized Binary Search

13 0.059444096 74 nips-2009-Efficient Bregman Range Search

14 0.057068553 17 nips-2009-A Sparse Non-Parametric Approach for Single Channel Separation of Known Sounds

15 0.056534402 139 nips-2009-Linear-time Algorithms for Pairwise Statistical Problems

16 0.050773729 52 nips-2009-Code-specific policy gradient rules for spiking neurons

17 0.047645997 198 nips-2009-Rank-Approximate Nearest Neighbor Search: Retaining Meaning and Speed in High Dimensions

18 0.047334883 157 nips-2009-Multi-Label Prediction via Compressed Sensing

19 0.046890479 64 nips-2009-Data-driven calibration of linear estimators with minimal penalties

20 0.045012426 87 nips-2009-Exponential Family Graph Matching and Ranking


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.13), (1, 0.114), (2, 0.086), (3, -0.119), (4, 0.098), (5, 0.139), (6, -0.157), (7, -0.009), (8, -0.014), (9, 0.069), (10, -0.114), (11, -0.024), (12, 0.013), (13, 0.072), (14, -0.029), (15, 0.169), (16, 0.061), (17, 0.142), (18, -0.021), (19, 0.023), (20, -0.027), (21, 0.051), (22, -0.041), (23, -0.073), (24, 0.002), (25, 0.023), (26, 0.043), (27, -0.01), (28, 0.161), (29, -0.032), (30, -0.132), (31, -0.017), (32, 0.034), (33, -0.106), (34, -0.121), (35, -0.016), (36, -0.009), (37, -0.013), (38, -0.052), (39, -0.043), (40, -0.014), (41, 0.02), (42, 0.059), (43, -0.16), (44, 0.069), (45, -0.115), (46, 0.063), (47, 0.044), (48, 0.037), (49, -0.035)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.94671237 24 nips-2009-Adapting to the Shifting Intent of Search Queries

Author: Umar Syed, Aleksandrs Slivkins, Nina Mishra

Abstract: Search engines today present results that are often oblivious to recent shifts in intent. For example, the meaning of the query ‘independence day’ shifts in early July to a US holiday and to a movie around the time of the box office release. While no studies exactly quantify the magnitude of intent-shifting traffic, studies suggest that news events, seasonal topics, pop culture, etc account for 1/2 the search queries. This paper shows that the signals a search engine receives can be used to both determine that a shift in intent happened, as well as find a result that is now more relevant. We present a meta-algorithm that marries a classifier with a bandit algorithm to achieve regret that depends logarithmically on the number of query impressions, under certain assumptions. We provide strong evidence that this regret is close to the best achievable. Finally, via a series of experiments, we demonstrate that our algorithm outperforms prior approaches, particularly as the amount of intent-shifting traffic increases. 1

2 0.65725136 14 nips-2009-A Parameter-free Hedging Algorithm

Author: Kamalika Chaudhuri, Yoav Freund, Daniel J. Hsu

Abstract: We study the problem of decision-theoretic online learning (DTOL). Motivated by practical applications, we focus on DTOL when the number of actions is very large. Previous algorithms for learning in this framework have a tunable learning rate parameter, and a barrier to using online-learning in practical applications is that it is not understood how to set this parameter optimally, particularly when the number of actions is large. In this paper, we offer a clean solution by proposing a novel and completely parameter-free algorithm for DTOL. We introduce a new notion of regret, which is more natural for applications with a large number of actions. We show that our algorithm achieves good performance with respect to this new notion of regret; in addition, it also achieves performance close to that of the best bounds achieved by previous algorithms with optimally-tuned parameters, according to previous notions of regret. 1

3 0.61321324 45 nips-2009-Beyond Convexity: Online Submodular Minimization

Author: Elad Hazan, Satyen Kale

Abstract: We consider an online decision problem over a discrete space in which the loss function is submodular. We give algorithms which are computationally efficient and are Hannan-consistent in both the full information and bandit settings. 1

4 0.61229235 181 nips-2009-Online Learning of Assignments

Author: Matthew Streeter, Daniel Golovin, Andreas Krause

Abstract: Which ads should we display in sponsored search in order to maximize our revenue? How should we dynamically rank information sources to maximize the value of the ranking? These applications exhibit strong diminishing returns: Redundancy decreases the marginal utility of each ad or information source. We show that these and other problems can be formalized as repeatedly selecting an assignment of items to positions to maximize a sequence of monotone submodular functions that arrive one by one. We present an efficient algorithm for this general problem and analyze it in the no-regret model. Our algorithm possesses strong theoretical guarantees, such as a performance ratio that converges to the optimal constant of 1 − 1/e. We empirically evaluate our algorithm on two real-world online optimization problems on the web: ad allocation with submodular utilities, and dynamically ranking blogs to detect information cascades. 1

5 0.49540311 178 nips-2009-On Stochastic and Worst-case Models for Investing

Author: Elad Hazan, Satyen Kale

Abstract: In practice, most investing is done assuming a probabilistic model of stock price returns known as the Geometric Brownian Motion (GBM). While often an acceptable approximation, the GBM model is not always valid empirically. This motivates a worst-case approach to investing, called universal portfolio management, where the objective is to maximize wealth relative to the wealth earned by the best fixed portfolio in hindsight. In this paper we tie the two approaches, and design an investment strategy which is universal in the worst-case, and yet capable of exploiting the mostly valid GBM model. Our method is based on new and improved regret bounds for online convex optimization with exp-concave loss functions. 1

6 0.45984536 90 nips-2009-Factor Modeling for Advertisement Targeting

7 0.42100608 156 nips-2009-Monte Carlo Sampling for Regret Minimization in Extensive Games

8 0.41033328 220 nips-2009-Slow Learners are Fast

9 0.38915312 193 nips-2009-Potential-Based Agnostic Boosting

10 0.34728515 74 nips-2009-Efficient Bregman Range Search

11 0.32259536 198 nips-2009-Rank-Approximate Nearest Neighbor Search: Retaining Meaning and Speed in High Dimensions

12 0.3216548 190 nips-2009-Polynomial Semantic Indexing

13 0.3142435 233 nips-2009-Streaming Pointwise Mutual Information

14 0.31262466 73 nips-2009-Dual Averaging Method for Regularized Stochastic Learning and Online Optimization

15 0.30130795 139 nips-2009-Linear-time Algorithms for Pairwise Statistical Problems

16 0.29824331 171 nips-2009-Nonparametric Bayesian Models for Unsupervised Event Coreference Resolution

17 0.29793087 63 nips-2009-DUOL: A Double Updating Approach for Online Learning

18 0.28614765 76 nips-2009-Efficient Learning using Forward-Backward Splitting

19 0.26204181 48 nips-2009-Bootstrapping from Game Tree Search

20 0.26087677 234 nips-2009-Streaming k-means approximation


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(21, 0.013), (24, 0.096), (25, 0.041), (35, 0.052), (36, 0.061), (39, 0.049), (55, 0.011), (58, 0.052), (61, 0.018), (70, 0.298), (71, 0.069), (81, 0.02), (84, 0.015), (86, 0.085), (91, 0.02)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.78090006 24 nips-2009-Adapting to the Shifting Intent of Search Queries

Author: Umar Syed, Aleksandrs Slivkins, Nina Mishra

Abstract: Search engines today present results that are often oblivious to recent shifts in intent. For example, the meaning of the query ‘independence day’ shifts in early July to a US holiday and to a movie around the time of the box office release. While no studies exactly quantify the magnitude of intent-shifting traffic, studies suggest that news events, seasonal topics, pop culture, etc. account for half of all search queries. This paper shows that the signals a search engine receives can be used both to determine that a shift in intent has happened and to find a result that is now more relevant. We present a meta-algorithm that marries a classifier with a bandit algorithm to achieve regret that depends logarithmically on the number of query impressions, under certain assumptions. We provide strong evidence that this regret is close to the best achievable. Finally, via a series of experiments, we demonstrate that our algorithm outperforms prior approaches, particularly as the amount of intent-shifting traffic increases.
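The bandit half of the meta-algorithm can be illustrated with the standard UCB1 rule choosing among search results from click feedback. This is only the generic bandit component, not the paper's classifier-coupled algorithm, and the click-through rates below are hypothetical.

```python
import math
import random

def ucb1(click_prob, horizon, seed=0):
    """UCB1 over search results; click_prob[i] is the click-through rate
    of result i, unknown to the algorithm and observed only via clicks."""
    rng = random.Random(seed)
    n = len(click_prob)
    pulls, clicks = [0] * n, [0] * n
    for t in range(1, horizon + 1):
        if t <= n:
            arm = t - 1                      # show every result once
        else:
            # pick the result with the largest upper confidence bound
            arm = max(range(n), key=lambda i:
                      clicks[i] / pulls[i] + math.sqrt(2 * math.log(t) / pulls[i]))
        pulls[arm] += 1
        clicks[arm] += rng.random() < click_prob[arm]   # simulated click
    return pulls

pulls = ucb1([0.1, 0.6, 0.2], horizon=2000)
print(pulls.index(max(pulls)))  # → 1 (the best result is shown most often)
```

After an intent shift, the click-through rates change; the paper's point is that a classifier can flag the shift so the bandit restarts exploration instead of staying locked on the stale result.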

2 0.62238884 180 nips-2009-On the Convergence of the Concave-Convex Procedure

Author: Gert R. Lanckriet, Bharath K. Sriperumbudur

Abstract: The concave-convex procedure (CCCP) is a majorization-minimization algorithm that solves d.c. (difference of convex functions) programs as a sequence of convex programs. In machine learning, CCCP is extensively used in many learning algorithms like sparse support vector machines (SVMs), transductive SVMs, sparse principal component analysis, etc. Though widely used in many applications, the convergence behavior of CCCP has not gotten a lot of specific attention. Yuille and Rangarajan analyzed its convergence in their original paper, however, we believe the analysis is not complete. Although the convergence of CCCP can be derived from the convergence of the d.c. algorithm (DCA), its proof is more specialized and technical than actually required for the specific case of CCCP. In this paper, we follow a different reasoning and show how Zangwill’s global convergence theory of iterative algorithms provides a natural framework to prove the convergence of CCCP, allowing a more elegant and simple proof. This underlines Zangwill’s theory as a powerful and general framework to deal with the convergence issues of iterative algorithms, after also being used to prove the convergence of algorithms like expectation-maximization, generalized alternating minimization, etc. In this paper, we provide a rigorous analysis of the convergence of CCCP by addressing these questions: (i) When does CCCP find a local minimum or a stationary point of the d.c. program under consideration? (ii) When does the sequence generated by CCCP converge? We also present an open problem on the issue of local convergence of CCCP.
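The CCCP iteration itself is short: at each step, linearize the concave part −h at the current point and minimize the resulting convex surrogate. Below is a toy one-dimensional d.c. program, chosen so the surrogate minimizer has a closed form; the split and the closed-form solver are assumptions for illustration only.

```python
def cccp_step(x, h_grad, argmin_g_linear):
    """One CCCP iteration for f = g - h: linearize h at x and minimize
    the convex surrogate g(y) - y * h'(x)."""
    return argmin_g_linear(h_grad(x))

# Toy d.c. program: minimize f(x) = x**4 - 2*x**2, split as g = x^4, h = 2x^2.
h_grad = lambda x: 4.0 * x                    # h'(x)
# argmin_y of y^4 - c*y solves 4y^3 = c, i.e. y = (c/4)^(1/3) (for c >= 0,
# which holds along this trajectory since we start from a positive point).
argmin_g_linear = lambda c: (c / 4.0) ** (1.0 / 3.0)

x = 0.5
for _ in range(30):
    x = cccp_step(x, h_grad, argmin_g_linear)
print(round(x, 6))  # → 1.0, a stationary point of f (here the minimizer)
```

This also illustrates the convergence questions the abstract raises: every iterate decreases f, and the sequence converges to a stationary point (x* = 1), but CCCP alone offers no guarantee of reaching a global minimum.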

3 0.49582314 250 nips-2009-Training Factor Graphs with Reinforcement Learning for Efficient MAP Inference

Author: Khashayar Rohanimanesh, Sameer Singh, Andrew McCallum, Michael J. Black

Abstract: Large, relational factor graphs with structure defined by first-order logic or other languages give rise to notoriously difficult inference problems. Because unrolling the structure necessary to represent distributions over all hypotheses has exponential blow-up, solutions are often derived from MCMC. However, because of limitations in the design and parameterization of the jump function, these sampling-based methods suffer from local minima: the system must transition through lower-scoring configurations before arriving at a better MAP solution. This paper presents a new method of explicitly selecting fruitful downward jumps by leveraging reinforcement learning (RL). Rather than setting parameters to maximize the likelihood of the training data, parameters of the factor graph are treated as a log-linear function approximator and learned with methods of temporal difference (TD); MAP inference is performed by executing the resulting policy on held-out test data. Our method allows efficient gradient updates since only factors in the neighborhood of variables affected by an action need to be computed, so we bypass the need to compute marginals entirely. Our method yields dramatic empirical success, producing new state-of-the-art results on a complex joint model of ontology alignment, with a 48% reduction in error over the state of the art in that domain.
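The TD-style parameter update the abstract relies on is, in its simplest linear form, one line. A minimal sketch of generic TD(0) with linear function approximation, not the paper's factor-graph instantiation:

```python
def td0_update(w, phi_s, reward, phi_next, alpha=0.1, gamma=0.9):
    """One TD(0) step for a linear value function v(s) = w . phi(s):
    nudge w along the feature vector in proportion to the TD error."""
    v = sum(wi * xi for wi, xi in zip(w, phi_s))
    v_next = sum(wi * xi for wi, xi in zip(w, phi_next))
    delta = reward + gamma * v_next - v          # TD error
    return [wi + alpha * delta * xi for wi, xi in zip(w, phi_s)]

# Toy self-looping state with reward 1 per step: v* = 1 / (1 - gamma) = 10.
w = [0.0]
for _ in range(2000):
    w = td0_update(w, [1.0], 1.0, [1.0])
print(round(w[0], 2))  # → 10.0
```

In the paper's setting, phi(s) would be the factor-graph features of a configuration and an action is a jump between configurations; the appeal noted in the abstract is that delta only touches factors near the variables the jump changed.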

4 0.49429178 3 nips-2009-AUC optimization and the two-sample problem

Author: Nicolas Vayatis, Marine Depecker, Stéphan J. Clémençon

Abstract: The purpose of the paper is to explore the connection between multivariate homogeneity tests and AUC optimization. The latter problem has recently received much attention in the statistical learning literature. From the elementary observation that, in the two-sample problem setup, the null assumption corresponds to the situation where the area under the optimal ROC curve is equal to 1/2, we propose a two-stage testing method based on data splitting. A nearly optimal scoring function in the AUC sense is first learnt from one of the two half-samples. Data from the remaining half-sample are then projected onto the real line and eventually ranked according to the scoring function computed at the first stage. The last step amounts to performing a standard Mann-Whitney Wilcoxon test in the one-dimensional framework. We show that the learning step of the procedure does not affect the consistency of the test as well as its properties in terms of power, provided the ranking produced is accurate enough in the AUC sense. The results of a numerical experiment are eventually displayed in order to show the efficiency of the method.
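The last step of the procedure, the rank test on projected scores, can be sketched directly: compute the Mann-Whitney U statistic (equivalently the empirical AUC) and its normal approximation. The scoring function below is a stand-in for the one learnt on the first half-sample, and the samples are invented.

```python
import math

def mann_whitney_z(scores_pos, scores_neg):
    """Mann-Whitney U statistic, reported via the empirical AUC, with a
    normal approximation. Under H0 (same distribution) AUC = 1/2 and the
    z-score is approximately N(0, 1)."""
    m, n = len(scores_pos), len(scores_neg)
    # U counts pairs (x, y) with score(x) > score(y); ties count 1/2.
    u = sum((x > y) + 0.5 * (x == y) for x in scores_pos for y in scores_neg)
    auc = u / (m * n)
    mean = m * n / 2.0
    std = math.sqrt(m * n * (m + n + 1) / 12.0)
    return auc, (u - mean) / std

# 'score' would be the function learnt on the first half-sample; here a stand-in.
score = lambda v: v
pos = [score(v) for v in [2.1, 2.5, 3.0, 2.8, 2.4]]
neg = [score(v) for v in [1.0, 1.4, 0.9, 1.2, 1.1]]
auc, z = mann_whitney_z(pos, neg)
print(auc)  # → 1.0 (perfect separation on this toy sample)
```

A large z rejects homogeneity; the paper's contribution is showing the learnt scoring step does not break the test's consistency.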

5 0.49162745 90 nips-2009-Factor Modeling for Advertisement Targeting

Author: Ye Chen, Michael Kapralov, John Canny, Dmitry Y. Pavlov

Abstract: We adapt a probabilistic latent variable model, namely GaP (Gamma-Poisson) [6], to ad targeting in the contexts of sponsored search (SS) and behaviorally targeted (BT) display advertising. We also approach the important problem of ad positional bias by formulating a one-latent-dimension GaP factorization. Learning from click-through data is intrinsically large scale, even more so for ads. We scale up the algorithm to terabytes of real-world SS and BT data that contains hundreds of millions of users and hundreds of thousands of features, by leveraging the scalability characteristics of the algorithm and the inherent structure of the problem including data sparsity and locality. Specifically, we demonstrate two somewhat orthogonal philosophies of scaling algorithms to large-scale problems, through the SS and BT implementations, respectively. Finally, we report the experimental results using Yahoo’s vast datasets, and show that our approach substantially outperforms the state-of-the-art methods in prediction accuracy. For BT in particular, the ROC area achieved by GaP exceeds 0.95, while one prior approach using Poisson regression [11] yielded 0.83. For computational performance, we compare a single-node sparse implementation with a parallel implementation using Hadoop MapReduce; the results are counterintuitive yet quite interesting. We therefore provide insights into the underlying principles of large-scale learning.
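The likelihood core of GaP, counts factorized as Y ≈ WH under a Poisson model, can be fit with the classic multiplicative updates for Poisson/KL NMF. The sketch below omits the Gamma priors on the factors, so it is only the maximum-likelihood skeleton of the model, and the toy matrix is invented.

```python
import random

def poisson_nmf(Y, k, iters=200, seed=0):
    """Multiplicative updates for Y ~ Poisson(W @ H) -- the likelihood
    core of GaP, with the Gamma priors on W and H omitted."""
    rng = random.Random(seed)
    m, n = len(Y), len(Y[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(m)]
    H = [[rng.random() + 0.1 for _ in range(n)] for _ in range(k)]
    for _ in range(iters):
        # reconstruction R = W @ H, then rescale W by how well it explains Y
        R = [[sum(W[i][a] * H[a][j] for a in range(k)) for j in range(n)]
             for i in range(m)]
        for i in range(m):
            for a in range(k):
                num = sum(H[a][j] * Y[i][j] / R[i][j] for j in range(n))
                den = sum(H[a][j] for j in range(n))
                W[i][a] *= num / den
        R = [[sum(W[i][a] * H[a][j] for a in range(k)) for j in range(n)]
             for i in range(m)]
        for a in range(k):
            for j in range(n):
                num = sum(W[i][a] * Y[i][j] / R[i][j] for i in range(m))
                den = sum(W[i][a] for i in range(m))
                H[a][j] *= num / den
    return W, H

# Toy rank-1 count matrix: the factorization recovers it almost exactly.
W, H = poisson_nmf([[4.0, 2.0], [2.0, 1.0]], k=1, iters=50)
print(round(W[0][0] * H[0][0], 2))  # → 4.0
```

In the paper's setting, rows would be users (or queries), columns ads or features, and the one-latent-dimension variant mentioned in the abstract corresponds to k = 1.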

6 0.4896118 116 nips-2009-Information-theoretic lower bounds on the oracle complexity of convex optimization

7 0.48833147 260 nips-2009-Zero-shot Learning with Semantic Output Codes

8 0.48726714 59 nips-2009-Construction of Nonparametric Bayesian Models from Parametric Bayes Equations

9 0.48472381 169 nips-2009-Nonlinear Learning using Local Coordinate Coding

10 0.4840658 16 nips-2009-A Smoothed Approximate Linear Program

11 0.48362109 12 nips-2009-A Generalized Natural Actor-Critic Algorithm

12 0.48318106 112 nips-2009-Human Rademacher Complexity

13 0.48268265 204 nips-2009-Replicated Softmax: an Undirected Topic Model

14 0.48259425 132 nips-2009-Learning in Markov Random Fields using Tempered Transitions

15 0.48135847 199 nips-2009-Ranking Measures and Loss Functions in Learning to Rank

16 0.48125693 9 nips-2009-A Game-Theoretic Approach to Hypergraph Clustering

17 0.48120797 136 nips-2009-Learning to Rank by Optimizing NDCG Measure

18 0.48091197 41 nips-2009-Bayesian Source Localization with the Multivariate Laplace Prior

19 0.48053199 111 nips-2009-Hierarchical Modeling of Local Image Features through $L p$-Nested Symmetric Distributions

20 0.48033524 255 nips-2009-Variational Inference for the Nested Chinese Restaurant Process