jmlr jmlr2005 jmlr2005-12 knowledge-graph by maker-knowledge-mining

12 jmlr-2005-An MDP-Based Recommender System


Source: pdf

Author: Guy Shani, David Heckerman, Ronen I. Brafman

Abstract: Typical recommender systems adopt a static view of the recommendation process and treat it as a prediction problem. We argue that it is more appropriate to view the problem of generating recommendations as a sequential optimization problem and, consequently, that Markov decision processes (MDPs) provide a more appropriate model for recommender systems. MDPs introduce two benefits: they take into account the long-term effects of each recommendation and the expected value of each recommendation. To succeed in practice, an MDP-based recommender system must employ a strong initial model, must be solvable quickly, and should not consume too much memory. In this paper, we describe our particular MDP model, its initialization using a predictive model, the solution and update algorithm, and its actual performance on a commercial site. We also describe the particular predictive model we used which outperforms previous models. Our system is one of a small number of commercially deployed recommender systems. As far as we know, it is the first to report experimental analysis conducted on a real commercial site. These results validate the commercial value of recommender systems, and in particular, of our MDP-based approach. Keywords: recommender systems, Markov decision processes, learning, commercial applications

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Computer Science Department, Ben-Gurion University, Beer-Sheva, Israel 84105. Editor: Craig Boutilier. Abstract: Typical recommender systems adopt a static view of the recommendation process and treat it as a prediction problem. [sent-9, score-0.713]

2 We argue that it is more appropriate to view the problem of generating recommendations as a sequential optimization problem and, consequently, that Markov decision processes (MDPs) provide a more appropriate model for recommender systems. [sent-10, score-0.858]

3 To alleviate this problem, many web sites attempt to help users by incorporating a recommender system (Resnick and Varian, 1997) that provides users with a list of items and/or webpages that are likely to interest them. [sent-20, score-1.231]

4 Once the user makes her choice, a new list of recommended items is presented. [sent-21, score-0.679]

5 In particular, an optimal policy will take into account the likelihood of a recommendation to be accepted by the user, the immediate value to the site of such an acceptance, and the long-term implications of this on the user’s future choices. [sent-38, score-0.574]

6 The approach we suggest initializes a predictive model of user behavior using data gathered on the site prior to the implementation of the recommender system. [sent-52, score-1.013]

7 The predictive model we describe is motivated by our sequential view of the recommendation process, but constitutes an independent contribution. [sent-56, score-0.537]

8 That is, the system is trained using historical data from sites that do not provide recommendations, and tested to see whether the recommendations conform to actual user behavior. [sent-65, score-0.627]

9 However, predictive accuracy is not an ideal measure, as it does not test how user behavior is influenced by the system’s suggestions or what percentage of recommendations are accepted by users. [sent-67, score-0.709]

10 These results, which to the best of our knowledge are among the first reports of online performance in a commercial site, are reported in Section 6, providing very encouraging validation to recommender systems in general, and to our sequential optimization approach in particular. [sent-75, score-0.562]

11 In Section 5 we explain how we use this predictive model as a basis for a more sophisticated MDP-based model for the recommender system. [sent-89, score-0.634]

12 Many web sites presenting a wide variety of content (such as articles, news stories, or items to purchase) discovered that users had difficulties finding the items that interested them out of the total selection. [sent-96, score-0.898]

13 Recommender Systems (Resnick and Varian, 1997) help users limit their search by supplying a list of items that might interest a specific user. [sent-97, score-0.535]

14 Different approaches were suggested for supplying meaningful recommendations to users and some were implemented in modern sites (Schafer et al. [sent-98, score-0.528]

15 Approaches originating from the field of information retrieval (IR) rely on the content of the items (such as description, category, title, author) and therefore are known as content-based recommendations (Mooney and Roy, 2000). [sent-103, score-0.677]

16 Based on this score, a list of items similar to the ones the user previously selected can be supplied. [sent-105, score-0.6]

17 Knowledge-based recommender systems (Burke, 2000) go one step further by using deeper knowledge about the user and the domain. [sent-106, score-0.645]

18 2 Collaborative Filtering The collaborative filtering approach originates in human behavior: people searching for an interesting item they know little of, such as a movie to rent at the video store, tend to rely on friends to recommend items they tried and liked. [sent-114, score-0.657]

19 Over the net, however, a larger community is available to recommend items to our user, but the people in this large community know little or nothing about each other. [sent-116, score-0.605]

20 Conceptually, the goal of a collaborative filtering engine is to identify those users whose taste in items is predictive of the taste of a certain person (usually called a neighborhood), and use their recommendations to construct a list of items interesting for her. [sent-117, score-1.43]

21 A binary grading method is used when a value of 1 is given to items the user has bought and 0 to other items. [sent-125, score-0.638]

22 Given the selections of a given user, a memory-based system identifies similar users and makes recommendations based on the items selected by these users. [sent-139, score-0.869]
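The memory-based scheme described above can be sketched in a few lines; this is an illustrative toy, not the paper's exact method, and the cosine similarity, item names, and `recommend` helper are all assumptions made for the example:

```python
import math

def cosine(u, v):
    """Cosine similarity between two binary purchase vectors (dicts item -> 1)."""
    if not u or not v:
        return 0.0
    common = set(u) & set(v)
    return len(common) / math.sqrt(len(u) * len(v))

def recommend(target, others, top_n=2):
    """Score each item bought by similar users but not yet by the target user."""
    scores = {}
    for profile in others:
        sim = cosine(target, profile)
        for item in profile:
            if item not in target:
                scores[item] = scores.get(item, 0.0) + sim
    return [item for item, _ in sorted(scores.items(), key=lambda kv: -kv[1])[:top_n]]

# Hypothetical purchase histories (1 = bought, per the binary grading scheme).
target = {"book_a": 1, "book_b": 1}
others = [{"book_a": 1, "book_b": 1, "book_c": 1},
          {"book_b": 1, "book_d": 1},
          {"book_e": 1}]
print(recommend(target, others))  # → ['book_c', 'book_d']
```

Items favored by users with high overlap (here `book_c`) outrank items from less similar neighbors.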

23 3 The Sequential Nature of the Recommendation Process Most recommender systems work in a sequential manner: they suggest items to the user who can then accept one of the recommendations. [sent-146, score-1.05]

24 This sequential nature of the recommendation process, where at each stage a new list is calculated based on the user’s past ratings, will lead us naturally to our reformulation of the recommendation process as a sequential optimization process. [sent-148, score-0.773]

25 Namely, optimal recommendations may depend not only on previous items purchased, but also on the order in which those items are purchased. [sent-150, score-1.016]

26 In the context of recommender systems, if we equate actions with recommendations, then an MDP can be used to model user behavior with recommendations – as we show below – whereas an MC can be used to model user behavior without recommendations. [sent-216, score-1.296]

27 The Predictive Model Our first step is to construct a predictive model of user purchases, that is, a model that can predict what item the user will buy next. [sent-227, score-0.937]

28 In Section 4 we shall show that our predictive model outperforms previous models, and in Section 5 we shall initialize our MDP-based recommender system using this predictive model. [sent-230, score-0.784]

29 Evaluation of the Predictive Model Before incorporating our predictive model into an MDP-based recommender system, we evaluated the accuracy of the predictive model. [sent-316, score-0.75]

30 Two data sets were used: one containing user transactions (purchases) and one containing user browsing paths obtained from web logs. [sent-324, score-0.52]

31 We filtered out items that were bought/visited fewer than 100 times and users who bought/browsed no more than one item, as is commonly done when evaluating predictive models (for example, Zimdars et al. [sent-325, score-0.872]

32 We were left with 116 items and 10820 users in the transactions data set, and 65 items and 6678 users in the browsing data set. [sent-327, score-1.038]

33 1 R ECOMMENDATION S CORE For this measure of accuracy, a recommendation is deemed successful if the observed item ti is among the top m recommended items (m is varied in the experiments). [sent-355, score-0.936]

34 This score is meaningful for commerce sites that require a short list of recommendations and therefore care little about the ordering of the items in the list. [sent-358, score-0.802]
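The recommendation score just described (a hit counts when the observed next item falls in the top m) can be sketched as follows; the session data and the fixed toy recommender are illustrative assumptions, not the paper's evaluation harness:

```python
def recommendation_score(sessions, recommender, m=3):
    """Percentage of observed next items t_i that appear in the top-m list
    generated from the items the user selected so far (hit rate at m)."""
    hits = trials = 0
    for session in sessions:
        for i in range(1, len(session)):
            history, observed = session[:i], session[i]
            if observed in recommender(history)[:m]:
                hits += 1
            trials += 1
    return 100.0 * hits / trials

def fixed(history):
    # Toy recommender that always returns the same ranked list (illustrative).
    return ["a", "b", "c", "d"]

print(recommendation_score([["x", "a", "b"], ["x", "d"]], fixed, m=3))
```

Because only membership in the top m matters, the ordering within the list has no effect on this score, which is exactly why the exponential decay score below is also needed.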

35 There are more items and users in the transaction data set since we used transactions over one year, whereas browsing data was collected only during one week. [sent-360, score-0.562]

36 2 E XPONENTIAL D ECAY S CORE This measure of accuracy is based on the position of the observed ti on the recommendation list, thus evaluating not only the content of the list but also the order of items in it. [sent-363, score-0.69]

37 In particular, it is assumed that a user will actually see the mth item in the list with probability p(m) = 2^(−(m−1)/(α−1)), m ≥ 1, (11) where α is the half-life parameter—the index of the item in the list that is seen with probability 0.5. [sent-365, score-0.709]

38 ,ti−1 is a case, and pos(ti |c) is the position of the observed item ti in the list of recommended items for c. [sent-370, score-0.683]
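A minimal sketch of the exponential decay score, applying Equation (11) to the position of each observed item; the list of positions is a hypothetical input:

```python
def exp_decay_score(positions, alpha=5):
    """Average exponential-decay credit: the m-th list position is assumed to be
    seen with probability 2^(-(m-1)/(alpha-1)), where alpha is the half-life."""
    credit = sum(2 ** (-(pos - 1) / (alpha - 1)) for pos in positions)
    return 100.0 * credit / len(positions)

# Positions pos(t_i | c) of the observed item in each recommendation list (hypothetical).
print(exp_decay_score([1, 5, 9], alpha=5))
```

Note how the half-life works: with alpha = 5, position 1 earns full credit, position 5 earns half, and position 9 a quarter, so the score rewards placing likely items near the top.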

39 In the non-sequential approach, for every item, a decision tree is built that predicts whether the item will be selected based on whether the remaining items were or were not selected. [sent-385, score-0.544]

40 In the sequential approach, for every item, a decision tree is built that predicts whether the item will be selected next, based on the previous k items that were selected. [sent-386, score-0.61]

41 In either case, once recommendations are available in the site (thus changing the site structure), skipping may prove beneficial. [sent-432, score-0.775]

42 We assume that we are given a set of cases describing user behavior within a site that does not provide recommendations, as well as a probabilistic predictive model of a user acting without recommendations generated from this data. [sent-437, score-1.126]

43 The predictive model provides the probability the user will purchase a particular item x given that her sequence of past purchases is x1 , . [sent-439, score-0.73]

44 The states of the MDP for our recommender system are k-tuples of items (for example, books, CDs), some prefix of which may contain null values corresponding to missing items. [sent-454, score-0.839]

45 Because the state encodes the list of items purchased, the reward depends on the last item defining the current state only. [sent-459, score-0.745]

46 For example, the reward for state x1 , x2 , x3 is the reward generated by the site from the sale of item x3 . [sent-460, score-0.537]

47 When we recommend an item x′, the user has three options: • Accept this recommendation, thus transferring from state ⟨x1, x2, x3⟩ into ⟨x2, x3, x′⟩ • Select some non-recommended item x″, thus transferring from state ⟨x1, x2, x3⟩ into ⟨x2, x3, x″⟩. [sent-463, score-0.780]
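The sliding-window state update described here is simple to sketch: a state is a k-tuple of the most recent items, padded with null values at the start of a session, and each selection shifts the window. The constant name and helper functions are illustrative, not from the paper:

```python
K = 3  # window size: states are k-tuples of the last k selected items

def initial_state():
    # A prefix of null values stands for the missing items at session start.
    return (None,) * K

def next_state(state, item):
    """Shift the window: drop the oldest item, append the newly selected one."""
    return state[1:] + (item,)

s = initial_state()
for item in ["x1", "x2", "x3", "x4"]:
    s = next_state(s, item)
print(s)  # → ('x2', 'x3', 'x4')
```

Whether the user accepts the recommended item or picks a different one, the transition is the same shift; only the identity of the appended item differs.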

48 The transition function for the MDP model: tr^1_MDP(⟨x1, x2, x3⟩, x′, ⟨x2, x3, x′⟩) (13) is the probability that the user will select item x′ given that item x′ is recommended in state ⟨x1, x2, x3⟩. [sent-466, score-0.863]

49 We write tr^1_MDP to denote that only single-item recommendations are used. [sent-467, score-0.543]

50 A for-profit e-commerce site is unlikely to use a recommender system that generates irrelevant recommendations for a long period, while waiting for it to converge to an optimal policy. [sent-472, score-0.956]

51 We can do so based on any good predictive model, making the following assumptions: • A recommendation increases the probability that a user will buy an item. [sent-474, score-0.736]

52 • The probability that a user will buy an item that was not recommended is lower than the probability that she will buy when the system issues no recommendations at all, but still proportional to it. [sent-478, score-1.027]

53 We use tr_predict(s, s · r) to denote the probability that the user will choose r next, given that her current state is s, according to the predictive model in which recommendations are not considered, that is, Pr_pred(r|s). [sent-487, score-0.852]
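The initialization assumptions can be sketched as follows: a recommended item's predicted probability is boosted, non-recommended items are damped proportionally, and each row is renormalized. The constants `alpha` and `beta`, the item set, and the toy predictive model are all illustrative assumptions, not the paper's fitted values:

```python
ITEMS = ["a", "b", "c"]

def init_mdp_transitions(tr_predict, alpha=1.5, beta=0.8):
    """Sketch: lift a recommendation-free predictive model into an MDP
    transition function. alpha > 1 boosts the recommended item, beta < 1
    damps the others (still proportional to the base model), and the row
    is renormalized so probabilities sum to one."""
    def tr_mdp(state, recommended, item):
        raw = (alpha if item == recommended else beta) * tr_predict(state, item)
        total = sum((alpha if other == recommended else beta) * tr_predict(state, other)
                    for other in ITEMS)
        return raw / total
    return tr_mdp

def pred(state, item):
    # Hypothetical recommendation-free model Pr_pred(item | state).
    return {"a": 0.5, "b": 0.3, "c": 0.2}[item]

tr = init_mdp_transitions(pred)
print(round(tr((), "a", "a"), 3))  # → 0.652
```

Recommending "a" raises its transition probability above its base rate of 0.5 while keeping the other items' probabilities proportional to the base model, matching the stated assumptions.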

54 CF models do not themselves account for the presence of recommendations, but using such systems to generate recommendations in commercial applications carries the underlying assumption that the recommendation will increase the likelihood that a user will purchase an item. [sent-493, score-1.163]

55 (2000), who showed that the increase in the probability of following a recommendation is large when one recommends items having high lift, defined to be pr(x|h)/pr(x). [sent-498, score-0.668]

56 It seems reasonable that, as the list of recommendations grows, the probability of selecting any item decreases. [sent-511, score-0.581]

57 Such a policy will, in effect, tell us what item to recommend given any sequence of user purchases. [sent-521, score-0.592]

58 Moreover, in the web site implementation, it is easy enough to filter out items that were already bought by the user from our list of recommendations. [sent-530, score-0.868]

59 This approximation, although risky in general MDPs, is motivated by the fact that in our initial model, for each state there is a relatively small number of items that are likely to be selected; and the probability of making a transition into an un-encountered state is very low. [sent-559, score-0.562]

60 The number of possible recommendations is n^κ, where n is the number of items and κ is the number of items we recommend each time. [sent-569, score-1.059]

61 Recall that we assumed that the probability that a user buys a particular item depends on her current state, the item, and whether or not this item is recommended. [sent-571, score-0.633]

62 The complexity of finding an optimal policy when recommending multiple items at each stage under our assumptions remains the same as the complexity of computing an optimal policy for single item recommendations. [sent-593, score-0.818]

63 In particular, the system does not recommend items that are likely to be bought whether recommended or not, but rather recommends items whose likelihood of being purchased is increased when they are recommended. [sent-595, score-1.025]

64 Nonetheless, when recommendations are based solely on lift, it is possible that many recommendations will be made for which the absolute probability of a purchase (or click) is small. [sent-596, score-0.733]

65 The site need only keep track of the recommendations and the user selections and, say, once a week use those statistics to build a new model and replace the old one with it. [sent-607, score-0.755]

66 • c_out(s, r, s · r)—the number of times the user took item r in state s even though it was not recommended, • c_total(s, s · r)—the number of times a user took item r while being in state s, regardless of whether it was recommended or not. [sent-610, score-1.039]
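These counters are cheap to maintain online. A minimal sketch, assuming (as the extracted text implies but does not show) a companion counter for selections that *were* recommended, called `c_in` here:

```python
from collections import Counter

c_in, c_out, c_total = Counter(), Counter(), Counter()

def record(state, item, recommended):
    """Update the online counters after each observed user selection."""
    key = (state, item)
    c_total[key] += 1
    if recommended:
        c_in[key] += 1   # item was taken while it was being recommended
    else:
        c_out[key] += 1  # item was taken even though it was not recommended

record(("x1",), "x2", recommended=True)
record(("x1",), "x2", recommended=False)
record(("x1",), "x2", recommended=False)
print(c_in[(("x1",), "x2")], c_out[(("x1",), "x2")], c_total[(("x1",), "x2")])  # → 1 2 3
```

Periodically, these counts can be turned into fresh transition-probability estimates without re-solving from scratch.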

67 This, in turn, requires that for each state s and for every recommendation r, we observe the response of users to a recommendation of r in state s sufficiently many times. [sent-619, score-0.844]

68 If at each state the system always returns the best recommendations only, then most values for count(s, r, s · r) would be 0, because most items will not appear among the best recommendations. [sent-620, score-0.763]

69 Another possible solution is to show the best recommendation on the top of the list, but show items less likely to be purchased as the second and third items on the list. [sent-629, score-1.046]

70 In our implementation we use a list of three recommendations where the first one is always the optimal one, but the second and third items are selected using the Boltzmann distribution without a cutoff. [sent-630, score-0.715]
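This exploration scheme can be sketched as below: keep the greedy-optimal item in the first slot and fill the remaining slots by Boltzmann sampling over the rest. The temperature, item values, and helper names are illustrative assumptions:

```python
import math
import random

def boltzmann_pick(values, temperature=1.0, rng=random):
    """Sample one key with probability proportional to exp(value / temperature)."""
    weights = {item: math.exp(v / temperature) for item, v in values.items()}
    total = sum(weights.values())
    r, acc = rng.random() * total, 0.0
    for item, w in weights.items():
        acc += w
        if r <= acc:
            return item
    return item  # numerical-edge fallback: return the last item

def recommendation_list(values, n=3, temperature=1.0, rng=random):
    """First slot is the greedy-optimal item; remaining slots are drawn from
    the Boltzmann distribution over the rest, so lower-valued items still
    get shown (and thus generate counts) occasionally."""
    best = max(values, key=values.get)
    rest = {item: v for item, v in values.items() if item != best}
    picks = [best]
    while len(picks) < n and rest:
        choice = boltzmann_pick(rest, temperature, rng)
        picks.append(choice)
        del rest[choice]
    return picks

vals = {"a": 2.0, "b": 1.0, "c": 0.5, "d": 0.1}
print(recommendation_list(vals)[0])  # → a
```

Because no item's sampling probability is cut to zero, every item keeps accumulating observations, which is exactly the data the update mechanism above needs.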

71 When new items are added, users will start buying them and positive counts for them will appear. [sent-632, score-0.523]

72 insert deleted items into the list of recommended items – an undesirable feature. [sent-642, score-0.795]

73 We compare the performance of our MDP-based recommender system (denoted MDP) with the performance of a recommender system based on our predictive model (denoted MC) as well as other variants. [sent-649, score-1.092]

74 Users received recommendations when adding items to the shopping cart. [sent-659, score-0.677]

75 12 The recommendations were based on the last k items added to the cart ordered by the time they were added. [sent-660, score-0.677]

76 Every time a user was presented with a list of recommendations on either page, the system stored the recommendations that were presented and recorded whether the user purchased a recommended item. [sent-662, score-1.33]

77 As discussed above, a predictive model can answer queries in the form Pr(x|h)—the probability that item x will be purchased given user history h. [sent-672, score-0.691]

78 Users also received recommendations when looking at the description of a book, but these recommendations were based only on the user’s visit to the current page and not on her cart. [sent-678, score-0.676]

79 We compared four policies, where the first policy uses information about the effect of recommendations, and the remaining policies are based on the predictive model solely: • Optimal – recommends items based on optimal policy for the MDP. [sent-684, score-0.82]

80 • Greedy – recommends items that maximize Pr(x|h) · R(x) (where Pr(x|h) is the probability of buying item x given user history h, and R(x) is the value of x to the site – for example, net profit). [sent-685, score-1.019]
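The Greedy baseline is a one-liner: pick the item with the largest expected immediate value Pr(x|h) · R(x). The item names, probabilities, and profits below are hypothetical:

```python
def greedy_recommend(history, items, pr, reward):
    """Greedy policy: recommend the item maximizing Pr(x|h) * R(x),
    ignoring any long-term effect on the user's future choices."""
    return max(items, key=lambda x: pr(x, history) * reward(x))

# Hypothetical purchase probabilities and per-item profits R(x).
def pr(x, h):
    return {"cheap": 0.6, "pricey": 0.3}[x]

def reward(x):
    return {"cheap": 1.0, "pricey": 4.0}[x]

print(greedy_recommend((), ["cheap", "pricey"], pr, reward))  # → pricey
```

Here the less likely but more profitable item wins (0.3 × 4.0 > 0.6 × 1.0); the Optimal MDP policy differs precisely in that it also weighs how each choice shapes future purchases.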

81 The length of user session was taken from the learned distribution of user session length in the actual site. [sent-690, score-0.524]

82 Based on the last bit of this cart-id, the user was provided with recommendations by the MDP or MC. [sent-704, score-0.561]

83 7% of the users who made at least one purchase were shown MDP-based recommendations and the other 49. [sent-714, score-0.553]

84 The first item was excluded as it was bought without the benefit of recommendations, and is therefore irrelevant to the comparison between the recommender systems. [sent-717, score-0.703]

85 This is not entirely accurate as the site also provides recommendations for items in the book description page. [sent-723, score-0.878]

86 We used the one-tailed version of the test as the directional hypothesis that the MDP recommender is better than the MC recommender has been theoretically motivated above. [sent-729, score-0.844]

87 In our experiment, the average number of items bought per user session was 6. [sent-736, score-0.677]

88 15), whereas the average price of items was 4% higher in favor of the MDP-based recommender (p = 0. [sent-738, score-0.761]

89 In our second and last experiment, we compared site performance with and without a recommender system. [sent-741, score-0.584]

90 Fortunately, the site owner was willing to remove the recommender system from the site for one week. [sent-744, score-0.78]

91 Thus, we were able to compare average profits per user session during two consecutive weeks – one with recommendations and one without recommendations. [sent-745, score-0.6]

92 18 We found that, when the recommender system was not in use, average site profit dropped 17% (p = 0. [sent-746, score-0.618]

93 Overall, our experiments support the claims concerning the added value of using recommendations in commercial web sites and the validity of the MDP-based model for recommender systems. [sent-749, score-0.928]

94 If recommendation latency is noticeable, no reasonable site administrator will use the recommender system. [sent-757, score-0.875]

95 Table 5 shows the number of recommendations generated per second by the recommender system. [sent-758, score-0.76]

96 We display recommendations between 3/27/2003 and 4/3/2003, and without recommendations from 3/19/2003 to 3/26/2003. [sent-763, score-0.676]

97 Our work presents one of a few examples of commercial systems that use MDPs, and one of the first reports on the performance of a commercially deployed recommender system. [sent-781, score-0.543]

98 Our experimental results validate both the utility of recommender systems and the utility of the MDP-based approach to recommender systems. [sent-782, score-0.882]

99 Our predictive model should also make use of relations between items that can be explicitly specified. [sent-799, score-0.519]

100 For example, most sites that sell items have a large catalogue with hierarchical structure such as categories or subjects, a carefully constructed web structure, and item properties such as author name. [sent-800, score-0.606]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('recommender', 0.422), ('items', 0.339), ('recommendations', 0.338), ('recommendation', 0.291), ('mdp', 0.277), ('user', 0.223), ('item', 0.205), ('site', 0.162), ('users', 0.158), ('predictive', 0.148), ('trmdp', 0.132), ('policy', 0.121), ('skipping', 0.113), ('transition', 0.099), ('eckerman', 0.095), ('ecommender', 0.095), ('hani', 0.095), ('rafman', 0.095), ('ystem', 0.079), ('recommended', 0.079), ('bought', 0.076), ('buy', 0.074), ('commercial', 0.074), ('collaborative', 0.07), ('rwd', 0.069), ('mc', 0.068), ('sequential', 0.066), ('zimdars', 0.063), ('reward', 0.059), ('tr', 0.059), ('purchase', 0.057), ('purchased', 0.057), ('mdps', 0.056), ('state', 0.052), ('vi', 0.051), ('pro', 0.05), ('pr', 0.047), ('browsing', 0.044), ('purchases', 0.044), ('trmc', 0.044), ('markov', 0.044), ('states', 0.044), ('recommend', 0.043), ('chain', 0.042), ('count', 0.041), ('ltering', 0.04), ('session', 0.039), ('book', 0.039), ('list', 0.038), ('mitos', 0.038), ('recommends', 0.038), ('system', 0.034), ('predictor', 0.034), ('age', 0.032), ('model', 0.032), ('lift', 0.032), ('recommending', 0.032), ('resnick', 0.032), ('successors', 0.032), ('heckerman', 0.032), ('sites', 0.032), ('rs', 0.031), ('score', 0.03), ('web', 0.03), ('deployed', 0.028), ('reinforcement', 0.028), ('boutilier', 0.026), ('buying', 0.026), ('sarwar', 0.026), ('decay', 0.026), ('action', 0.026), ('history', 0.026), ('gathered', 0.026), ('actions', 0.026), ('xk', 0.026), ('breese', 0.026), ('brafman', 0.025), ('commerce', 0.025), ('selling', 0.025), ('enhancement', 0.023), ('ti', 0.022), ('models', 0.022), ('past', 0.021), ('policies', 0.021), ('mixture', 0.021), ('clustering', 0.021), ('konstan', 0.021), ('transaction', 0.021), ('transitions', 0.021), ('acm', 0.02), ('likely', 0.02), ('books', 0.02), ('utility', 0.019), ('microsoft', 0.019), ('ct', 0.019), ('baux', 0.019), ('bigram', 0.019), ('bookstore', 0.019), ('commercially', 0.019), ('cooking', 0.019)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.9999997 12 jmlr-2005-An MDP-Based Recommender System

Author: Guy Shani, David Heckerman, Ronen I. Brafman

Abstract: Typical recommender systems adopt a static view of the recommendation process and treat it as a prediction problem. We argue that it is more appropriate to view the problem of generating recommendations as a sequential optimization problem and, consequently, that Markov decision processes (MDPs) provide a more appropriate model for recommender systems. MDPs introduce two benefits: they take into account the long-term effects of each recommendation and the expected value of each recommendation. To succeed in practice, an MDP-based recommender system must employ a strong initial model, must be solvable quickly, and should not consume too much memory. In this paper, we describe our particular MDP model, its initialization using a predictive model, the solution and update algorithm, and its actual performance on a commercial site. We also describe the particular predictive model we used which outperforms previous models. Our system is one of a small number of commercially deployed recommender systems. As far as we know, it is the first to report experimental analysis conducted on a real commercial site. These results validate the commercial value of recommender systems, and in particular, of our MDP-based approach. Keywords: recommender systems, Markov decision processes, learning, commercial applications

2 0.13344274 61 jmlr-2005-Prioritization Methods for Accelerating MDP Solvers

Author: David Wingate, Kevin D. Seppi

Abstract: The performance of value and policy iteration can be dramatically improved by eliminating redundant or useless backups, and by backing up states in the right order. We study several methods designed to accelerate these iterative solvers, including prioritization, partitioning, and variable reordering. We generate a family of algorithms by combining several of the methods discussed, and present extensive empirical evidence demonstrating that performance can improve by several orders of magnitude for many problems, while preserving accuracy and convergence guarantees. Keywords: Markov Decision Processes, value iteration, policy iteration, prioritized sweeping, dynamic programming

3 0.074027039 68 jmlr-2005-Tree-Based Batch Mode Reinforcement Learning

Author: Damien Ernst, Pierre Geurts, Louis Wehenkel

Abstract: Reinforcement learning aims to determine an optimal control policy from interaction with a system or from observations gathered from a system. In batch mode, it can be achieved by approximating the so-called Q-function based on a set of four-tuples (xt , ut , rt , xt+1 ) where xt denotes the system state at time t, ut the control action taken, rt the instantaneous reward obtained and xt+1 the successor state of the system, and by determining the control policy from this Q-function. The Q-function approximation may be obtained from the limit of a sequence of (batch mode) supervised learning problems. Within this framework we describe the use of several classical tree-based supervised learning methods (CART, Kd-tree, tree bagging) and two newly proposed ensemble algorithms, namely extremely and totally randomized trees. We study their performances on several examples and find that the ensemble methods based on regression trees perform well in extracting relevant information about the optimal control policy from sets of four-tuples. In particular, the totally randomized trees give good results while ensuring the convergence of the sequence, whereas by relaxing the convergence constraint even better accuracy results are provided by the extremely randomized trees. Keywords: batch mode reinforcement learning, regression trees, ensemble methods, supervised learning, fitted value iteration, optimal control

4 0.069555238 14 jmlr-2005-Assessing Approximate Inference for Binary Gaussian Process Classification

Author: Malte Kuss, Carl Edward Rasmussen

Abstract: Gaussian process priors can be used to define flexible, probabilistic classification models. Unfortunately exact Bayesian inference is analytically intractable and various approximation techniques have been proposed. In this work we review and compare Laplace’s method and Expectation Propagation for approximate Bayesian inference in the binary Gaussian process classification model. We present a comprehensive comparison of the approximations, their predictive performance and marginal likelihood estimates to results obtained by MCMC sampling. We explain theoretically and corroborate empirically the advantages of Expectation Propagation compared to Laplace’s method. Keywords: Gaussian process priors, probabilistic classification, Laplace’s approximation, expectation propagation, marginal likelihood, evidence, MCMC

5 0.06297794 4 jmlr-2005-A Framework for Learning Predictive Structures from Multiple Tasks and Unlabeled Data

Author: Rie Kubota Ando, Tong Zhang

Abstract: One of the most important issues in machine learning is whether one can improve the performance of a supervised learning algorithm by including unlabeled data. Methods that use both labeled and unlabeled data are generally referred to as semi-supervised learning. Although a number of such methods are proposed, at the current stage, we still don’t have a complete understanding of their effectiveness. This paper investigates a closely related problem, which leads to a novel approach to semi-supervised learning. Specifically we consider learning predictive structures on hypothesis spaces (that is, what kind of classifiers have good predictive power) from multiple learning tasks. We present a general framework in which the structural learning problem can be formulated and analyzed theoretically, and relate it to learning with unlabeled data. Under this framework, algorithms for structural learning will be proposed, and computational issues will be investigated. Experiments will be given to demonstrate the effectiveness of the proposed algorithms in the semi-supervised learning setting.

6 0.060356472 36 jmlr-2005-Gaussian Processes for Ordinal Regression

7 0.045680705 7 jmlr-2005-A Unifying View of Sparse Approximate Gaussian Process Regression

8 0.034217302 2 jmlr-2005-A Bayesian Model for Supervised Clustering with the Dirichlet Process Prior

9 0.030229479 53 jmlr-2005-Machine Learning Methods for Predicting Failures in Hard Drives: A Multiple-Instance Application

10 0.026850283 55 jmlr-2005-Matrix Exponentiated Gradient Updates for On-line Learning and Bregman Projection

11 0.024651667 19 jmlr-2005-Clustering on the Unit Hypersphere using von Mises-Fisher Distributions

12 0.024425084 43 jmlr-2005-Learning Hidden Variable Networks: The Information Bottleneck Approach

13 0.023298543 31 jmlr-2005-Estimation of Non-Normalized Statistical Models by Score Matching

14 0.023024732 69 jmlr-2005-Tutorial on Practical Prediction Theory for Classification

15 0.022613555 34 jmlr-2005-Feature Selection for Unsupervised and Supervised Inference: The Emergence of Sparsity in a Weight-Based Approach

16 0.020313304 6 jmlr-2005-A Modified Finite Newton Method for Fast Solution of Large Scale Linear SVMs

17 0.020283977 72 jmlr-2005-What's Strange About Recent Events (WSARE): An Algorithm for the Early Detection of Disease Outbreaks

18 0.019341903 42 jmlr-2005-Large Margin Methods for Structured and Interdependent Output Variables

19 0.018753564 1 jmlr-2005-A Bayes Optimal Approach for Partitioning the Values of Categorical Attributes

20 0.018107075 8 jmlr-2005-Active Coevolutionary Learning of Deterministic Finite Automata


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.149), (1, -0.023), (2, 0.041), (3, -0.231), (4, 0.132), (5, 0.057), (6, 0.209), (7, -0.076), (8, 0.372), (9, -0.003), (10, -0.084), (11, -0.018), (12, -0.161), (13, 0.026), (14, -0.0), (15, -0.049), (16, -0.036), (17, 0.076), (18, 0.138), (19, 0.028), (20, 0.224), (21, 0.018), (22, 0.136), (23, -0.019), (24, 0.04), (25, -0.056), (26, -0.053), (27, -0.142), (28, 0.119), (29, -0.029), (30, -0.041), (31, -0.023), (32, -0.036), (33, -0.047), (34, -0.041), (35, 0.034), (36, -0.146), (37, -0.202), (38, -0.065), (39, 0.049), (40, 0.041), (41, -0.243), (42, 0.032), (43, -0.014), (44, 0.005), (45, 0.057), (46, 0.001), (47, 0.105), (48, -0.01), (49, -0.062)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.97355467 12 jmlr-2005-An MDP-Based Recommender System

Author: Guy Shani, David Heckerman, Ronen I. Brafman

Abstract: Typical recommender systems adopt a static view of the recommendation process and treat it as a prediction problem. We argue that it is more appropriate to view the problem of generating recommendations as a sequential optimization problem and, consequently, that Markov decision processes (MDPs) provide a more appropriate model for recommender systems. MDPs introduce two benefits: they take into account the long-term effects of each recommendation and the expected value of each recommendation. To succeed in practice, an MDP-based recommender system must employ a strong initial model, must be solvable quickly, and should not consume too much memory. In this paper, we describe our particular MDP model, its initialization using a predictive model, the solution and update algorithm, and its actual performance on a commercial site. We also describe the particular predictive model we used which outperforms previous models. Our system is one of a small number of commercially deployed recommender systems. As far as we know, it is the first to report experimental analysis conducted on a real commercial site. These results validate the commercial value of recommender systems, and in particular, of our MDP-based approach. Keywords: recommender systems, Markov decision processes, learning, commercial applications

2 0.66876918 61 jmlr-2005-Prioritization Methods for Accelerating MDP Solvers

Author: David Wingate, Kevin D. Seppi

Abstract: The performance of value and policy iteration can be dramatically improved by eliminating redundant or useless backups, and by backing up states in the right order. We study several methods designed to accelerate these iterative solvers, including prioritization, partitioning, and variable reordering. We generate a family of algorithms by combining several of the methods discussed, and present extensive empirical evidence demonstrating that performance can improve by several orders of magnitude for many problems, while preserving accuracy and convergence guarantees. Keywords: Markov Decision Processes, value iteration, policy iteration, prioritized sweeping, dynamic programming
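One common form of the prioritization this abstract refers to is to keep a priority queue of states ordered by Bellman error and back up the highest-error state first, re-queueing its predecessors. A self-contained sketch under that assumption (the MDP encoding and the three-state chain are invented, and this is not the authors' implementation):

```python
import heapq

def prioritized_value_iteration(transitions, gamma=0.9, eps=1e-6):
    """Value iteration that backs up the state with the largest Bellman error
    first, re-queueing predecessors whose error may have grown."""
    V = {s: 0.0 for s in transitions}
    # Predecessor map: which states can transition into each state.
    preds = {s: set() for s in transitions}
    for s, actions in transitions.items():
        for outcomes in actions.values():
            for _, s2, _ in outcomes:
                preds[s2].add(s)

    def backup(s):
        actions = transitions[s]
        if not actions:
            return 0.0  # terminal state
        return max(sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
                   for outcomes in actions.values())

    # Max-heap via negated priorities, seeded with every state's initial error.
    heap = [(-abs(backup(s) - V[s]), s) for s in transitions]
    heapq.heapify(heap)
    while heap:
        _, s = heapq.heappop(heap)
        err = abs(backup(s) - V[s])  # the queued priority may be stale
        if err < eps:
            continue
        V[s] = backup(s)
        # States leading into s may now have a larger Bellman error.
        for p in preds[s]:
            heapq.heappush(heap, (-abs(backup(p) - V[p]), p))
    return V

# A three-state chain s0 -> s1 -> s2 (terminal), invented for illustration.
chain = {
    "s0": {"a": [(1.0, "s1", 1.0)]},
    "s1": {"a": [(1.0, "s2", 2.0)]},
    "s2": {},
}
V = prioritized_value_iteration(chain)  # V["s1"] = 2.0, V["s0"] = 1 + 0.9*2 = 2.8
```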

3 0.26941577 4 jmlr-2005-A Framework for Learning Predictive Structures from Multiple Tasks and Unlabeled Data

Author: Rie Kubota Ando, Tong Zhang

Abstract: One of the most important issues in machine learning is whether one can improve the performance of a supervised learning algorithm by including unlabeled data. Methods that use both labeled and unlabeled data are generally referred to as semi-supervised learning. Although a number of such methods are proposed, at the current stage, we still don’t have a complete understanding of their effectiveness. This paper investigates a closely related problem, which leads to a novel approach to semi-supervised learning. Specifically we consider learning predictive structures on hypothesis spaces (that is, what kind of classifiers have good predictive power) from multiple learning tasks. We present a general framework in which the structural learning problem can be formulated and analyzed theoretically, and relate it to learning with unlabeled data. Under this framework, algorithms for structural learning will be proposed, and computational issues will be investigated. Experiments will be given to demonstrate the effectiveness of the proposed algorithms in the semi-supervised learning setting.

4 0.26046863 68 jmlr-2005-Tree-Based Batch Mode Reinforcement Learning

Author: Damien Ernst, Pierre Geurts, Louis Wehenkel

Abstract: Reinforcement learning aims to determine an optimal control policy from interaction with a system or from observations gathered from a system. In batch mode, it can be achieved by approximating the so-called Q-function based on a set of four-tuples (xt , ut , rt , xt+1 ) where xt denotes the system state at time t, ut the control action taken, rt the instantaneous reward obtained and xt+1 the successor state of the system, and by determining the control policy from this Q-function. The Q-function approximation may be obtained from the limit of a sequence of (batch mode) supervised learning problems. Within this framework we describe the use of several classical tree-based supervised learning methods (CART, Kd-tree, tree bagging) and two newly proposed ensemble algorithms, namely extremely and totally randomized trees. We study their performances on several examples and find that the ensemble methods based on regression trees perform well in extracting relevant information about the optimal control policy from sets of four-tuples. In particular, the totally randomized trees give good results while ensuring the convergence of the sequence, whereas by relaxing the convergence constraint even better accuracy results are provided by the extremely randomized trees. Keywords: batch mode reinforcement learning, regression trees, ensemble methods, supervised learning, fitted value iteration, optimal control

5 0.21762753 36 jmlr-2005-Gaussian Processes for Ordinal Regression

Author: Wei Chu, Zoubin Ghahramani

Abstract: We present a probabilistic kernel approach to ordinal regression based on Gaussian processes. A threshold model that generalizes the probit function is used as the likelihood function for ordinal variables. Two inference techniques, based on the Laplace approximation and the expectation propagation algorithm respectively, are derived for hyperparameter learning and model selection. We compare these two Gaussian process approaches with a previous ordinal regression method based on support vector machines on some benchmark and real-world data sets, including applications of ordinal regression to collaborative filtering and gene expression analysis. Experimental results on these data sets verify the usefulness of our approach. Keywords: Gaussian processes, ordinal regression, approximate Bayesian inference, collaborative filtering, gene expression analysis, feature selection

6 0.1839419 14 jmlr-2005-Assessing Approximate Inference for Binary Gaussian Process Classification

7 0.12006399 7 jmlr-2005-A Unifying View of Sparse Approximate Gaussian Process Regression

8 0.11869357 43 jmlr-2005-Learning Hidden Variable Networks: The Information Bottleneck Approach

9 0.11557572 55 jmlr-2005-Matrix Exponentiated Gradient Updates for On-line Learning and Bregman Projection

10 0.11117817 6 jmlr-2005-A Modified Finite Newton Method for Fast Solution of Large Scale Linear SVMs

11 0.1043208 8 jmlr-2005-Active Coevolutionary Learning of Deterministic Finite Automata

12 0.1011474 10 jmlr-2005-Adaptive Online Prediction by Following the Perturbed Leader

13 0.094137534 2 jmlr-2005-A Bayesian Model for Supervised Clustering with the Dirichlet Process Prior

14 0.089252934 64 jmlr-2005-Semigroup Kernels on Measures

15 0.08792083 48 jmlr-2005-Learning the Kernel Function via Regularization

16 0.087838948 25 jmlr-2005-Denoising Source Separation

17 0.087565586 40 jmlr-2005-Inner Product Spaces for Bayesian Networks

18 0.083746612 72 jmlr-2005-What's Strange About Recent Events (WSARE): An Algorithm for the Early Detection of Disease Outbreaks

19 0.083047278 53 jmlr-2005-Machine Learning Methods for Predicting Failures in Hard Drives: A Multiple-Instance Application

20 0.081798829 19 jmlr-2005-Clustering on the Unit Hypersphere using von Mises-Fisher Distributions


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(13, 0.011), (17, 0.015), (19, 0.018), (36, 0.013), (37, 0.061), (42, 0.015), (43, 0.033), (47, 0.013), (52, 0.62), (59, 0.018), (68, 0.013), (70, 0.02), (88, 0.048), (90, 0.011), (94, 0.011)]
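The topicId/topicWeight list above is a sparse distribution over LDA topics, and similarity between papers can be computed directly on such distributions. A sketch assuming a simple dot product over shared topics (the site's actual similarity function is not stated; the "other" vector is invented):

```python
def topic_similarity(a, b):
    """Dot product of two sparse (topicId, topicWeight) lists over shared topics."""
    db = dict(b)
    return sum(w * db.get(t, 0.0) for t, w in a)

# "paper" reuses a few of the topic weights above; "other" is invented.
paper = [(37, 0.061), (52, 0.62), (88, 0.048)]
other = [(52, 0.58), (88, 0.10), (19, 0.02)]
print(topic_similarity(paper, other))  # dominated by the shared heavy topic 52
```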

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.98235565 12 jmlr-2005-An MDP-Based Recommender System

2 0.97824711 50 jmlr-2005-Learning with Decision Lists of Data-Dependent Features

Author: Mario Marchand, Marina Sokolova

Abstract: We present a learning algorithm for decision lists which allows features that are constructed from the data and allows a trade-off between accuracy and complexity. We provide bounds on the generalization error of this learning algorithm in terms of the number of errors and the size of the classifier it finds on the training data. We also compare its performance on some natural data sets with the set covering machine and the support vector machine. Furthermore, we show that the proposed bounds on the generalization error provide effective guides for model selection. Keywords: decision list machines, set covering machines, sparsity, data-dependent features, sample compression, model selection, learning theory

3 0.97186548 34 jmlr-2005-Feature Selection for Unsupervised and Supervised Inference: The Emergence of Sparsity in a Weight-Based Approach

Author: Lior Wolf, Amnon Shashua

Abstract: The problem of selecting a subset of relevant features in a potentially overwhelming quantity of data is classic and found in many branches of science. Examples in computer vision, text processing and more recently bio-informatics are abundant. In text classification tasks, for example, it is not uncommon to have 104 to 107 features of the size of the vocabulary containing word frequency counts, with the expectation that only a small fraction of them are relevant. Typical examples include the automatic sorting of URLs into a web directory and the detection of spam email. In this work we present a definition of “relevancy” based on spectral properties of the Laplacian of the features’ measurement matrix. The feature selection process is then based on a continuous ranking of the features defined by a least-squares optimization process. A remarkable property of the feature relevance function is that sparse solutions for the ranking values naturally emerge as a result of a “biased non-negativity” of a key matrix in the process. As a result, a simple leastsquares optimization process converges onto a sparse solution, i.e., a selection of a subset of features which form a local maximum over the relevance function. The feature selection algorithm can be embedded in both unsupervised and supervised inference problems and empirical evidence show that the feature selections typically achieve high accuracy even when only a small fraction of the features are relevant.

4 0.91989082 9 jmlr-2005-Active Learning to Recognize Multiple Types of Plankton

Author: Tong Luo, Kurt Kramer, Dmitry B. Goldgof, Lawrence O. Hall, Scott Samson, Andrew Remsen, Thomas Hopkins

Abstract: This paper presents an active learning method which reduces the labeling effort of domain experts in multi-class classification problems. Active learning is applied in conjunction with support vector machines to recognize underwater zooplankton from higher-resolution, new generation SIPPER II images. Most previous work on active learning with support vector machines only deals with two class problems. In this paper, we propose an active learning approach “breaking ties” for multiclass support vector machines using the one-vs-one approach with a probability approximation. Experimental results indicate that our approach often requires significantly less labeled images to reach a given accuracy than the approach of labeling the least certain test example and random sampling. It can also be applied in batch mode resulting in an accuracy comparable to labeling one image at a time and retraining. Keywords: active learning, support vector machine, plankton recognition, probabilistic output, multi-class support vector machine
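The "breaking ties" selection described in this abstract queries the unlabeled example whose top two class probabilities are closest, i.e. where the classifier is least decided between classes. A minimal sketch of that selection rule alone (the probability rows are invented; producing them from one-vs-one SVMs is out of scope here):

```python
def breaking_ties_choice(prob_rows):
    """Return the index of the example whose top two class probabilities are
    closest together, i.e. the example the classifier is least decided about."""
    def tie_margin(probs):
        top_two = sorted(probs, reverse=True)[:2]
        return top_two[0] - top_two[1]
    return min(range(len(prob_rows)), key=lambda i: tie_margin(prob_rows[i]))

# Invented class-probability rows for three unlabeled examples.
probs = [[0.8, 0.1, 0.1], [0.4, 0.35, 0.25], [0.5, 0.3, 0.2]]
print(breaking_ties_choice(probs))  # -> 1 (margin 0.05 is the smallest)
```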

5 0.78366661 24 jmlr-2005-Core Vector Machines: Fast SVM Training on Very Large Data Sets

Author: Ivor W. Tsang, James T. Kwok, Pak-Ming Cheung

Abstract: Standard SVM training has O(m3) time and O(m2) space complexities, where m is the training set size. It is thus computationally infeasible on very large data sets. By observing that practical SVM implementations only approximate the optimal solution by an iterative strategy, we scale up kernel methods by exploiting such “approximateness” in this paper. We first show that many kernel methods can be equivalently formulated as minimum enclosing ball (MEB) problems in computational geometry. Then, by adopting an efficient approximate MEB algorithm, we obtain provably approximately optimal solutions with the idea of core sets. Our proposed Core Vector Machine (CVM) algorithm can be used with nonlinear kernels and has a time complexity that is linear in m and a space complexity that is independent of m. Experiments on large toy and real-world data sets demonstrate that the CVM is as accurate as existing SVM implementations, but is much faster and can handle much larger data sets than existing scale-up methods. For example, CVM with the Gaussian kernel produces superior results on the KDDCUP-99 intrusion detection data, which has about five million training patterns, in only 1.4 seconds on a 3.2GHz Pentium-4 PC. Keywords: kernel methods, approximation algorithm, minimum enclosing ball, core set, scalability
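The core-set idea in this abstract rests on cheap approximate minimum enclosing ball computation. A minimal sketch of one standard such approximation, in the style of Badoiu-Clarkson (repeatedly pull the center toward the farthest point with a shrinking step); this is a generic illustration, not the CVM implementation:

```python
def approx_meb_center(points, iters=100):
    """Approximate minimum enclosing ball center: repeatedly move the current
    center toward the farthest point, with step size shrinking as 1/(i+1)."""
    center = list(points[0])
    for i in range(1, iters + 1):
        farthest = max(points,
                       key=lambda p: sum((pj - cj) ** 2
                                         for pj, cj in zip(p, center)))
        center = [cj + (fj - cj) / (i + 1) for cj, fj in zip(center, farthest)]
    return center

# Three points whose exact MEB center is (1.0, 0.0) with radius 1.
pts = [(0.0, 0.0), (2.0, 0.0), (1.0, 1.0)]
print(approx_meb_center(pts))  # approaches (1.0, 0.0)
```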

6 0.78070194 33 jmlr-2005-Fast Kernel Classifiers with Online and Active Learning

7 0.74530876 8 jmlr-2005-Active Coevolutionary Learning of Deterministic Finite Automata

8 0.7150839 46 jmlr-2005-Learning a Mahalanobis Metric from Equivalence Constraints

9 0.70090288 36 jmlr-2005-Gaussian Processes for Ordinal Regression

10 0.6901682 19 jmlr-2005-Clustering on the Unit Hypersphere using von Mises-Fisher Distributions

11 0.66595703 45 jmlr-2005-Learning Multiple Tasks with Kernel Methods

12 0.65455586 69 jmlr-2005-Tutorial on Practical Prediction Theory for Classification

13 0.64694923 56 jmlr-2005-Maximum Margin Algorithms with Boolean Kernels

14 0.64100641 65 jmlr-2005-Separating a Real-Life Nonlinear Image Mixture

15 0.62704426 18 jmlr-2005-Characterization of a Family of Algorithms for Generalized Discriminant Analysis on Undersampled Problems

16 0.61709511 54 jmlr-2005-Managing Diversity in Regression Ensembles

17 0.61659461 49 jmlr-2005-Learning the Kernel with Hyperkernels (Kernel Machines Section)

18 0.61579686 25 jmlr-2005-Denoising Source Separation

19 0.61508173 39 jmlr-2005-Information Bottleneck for Gaussian Variables

20 0.60723072 68 jmlr-2005-Tree-Based Batch Mode Reinforcement Learning